
Re: Simplifying SRFI 109, part 1: entities

This page is part of the web mail archives of SRFI 109 from before July 7th, 2015. The new archives for SRFI 109 contain all messages, not just those from before July 7th, 2015.



On 03/30/2013 10:57 PM, John Cowan wrote:
Per Bothner scripsit:

It is more important to preserve the XML conceptual "information
model",

Absolutely, but SRFI 107 models only a part of the XML Infoset
<http://www.w3.org/TR/xml-infoset>, of which (ahem) I was the principal
editor.

Yes, I actually consulted that today and noticed the "coincidence"
of the name.  Glad to confirm my assumption they're one and the same.
(I also know a John Cowan socially, but he doesn't appear to be you ...)

SRFI 107 as currently written does not support the concept of an
XML document - whether we mean:  (1) XML document as a file format.
(2) DOM Document as a data type for representing the "significant"
information

It's the second concept I mean.

SRFI-107 doesn't directly support either.  I think APIs supporting
both are desirable - and SRFI-107 should hopefully work well with such
APIs.  However, this process has dragged on long enough, and working
with documents seems like new functionality that I think should be
saved for future work.

In that case, perhaps this SRFI should be renamed "XML element reader
syntax".

First, this SRFI also has a reader syntax for PI nodes, comment nodes,
and CDATA nodes.

There is no support for (top-level) attribute nodes, though you can
write them with a $xml-attribute$ constructor.

My assumption is a Scheme API for XML would have standard
function calls for creating DOM Nodes, and the SRFI-107
syntax would be basically syntactic sugar.
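
As a minimal sketch of the syntactic-sugar idea: the constructor names
make-element and make-attribute below are hypothetical stand-ins, not
part of SRFI-107 or of any existing Scheme XML API, and nodes are
modelled as plain tagged lists only to keep the example self-contained.

```scheme
;; Hypothetical node constructors (assumptions for illustration only).
;; A real XML API would return DOM-like node objects, not tagged lists.
(define (make-attribute name value)
  (list 'attribute name value))

(define (make-element name . children)
  (cons 'element (cons name children)))

;; Under the syntactic-sugar view, a reader literal such as
;;   #<a href="x.html">link</a>
;; would be another way of writing an ordinary constructor call:
(define node
  (make-element 'a
                (make-attribute 'href "x.html")
                "link"))
```

The point is only that the literal and the explicit call denote the same
node value; the exact expression the reader produces is defined by the
SRFI text, not by this sketch.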

Some other languages also provide similar XML "literals".
My goal is not to embed XML documents inside a program,
but to provide a more familiar syntax equivalent to node
constructor expressions.  XQuery has functionality equivalent
to this proposal: XML-style syntax for elements, PIs, and
comments.  For attributes and documents you need to use
"computed constructors".  "EcmaScript for XML" is similar.
Visual Basic does support "XML Document literals", and
I guess we can add support for them.  Perhaps we can allow:

  #<?xml optional-stuff?><root>...</root>

and/or:

  #<!DOCTYPE root optional-stuff><root>...</root>

And of course we can use SRFI-108 in some way.

However, doing this "right" adds a fair bit of extra
work and complexity.  It also pushes the limits of my
expertise.  The obvious reader translation is to an
$xml-document$ constructor.  Beyond that things are less
obvious.  Because the infoset for an XML document is rather
complex, it seems cleaner to use keywords - but alas keywords
are far from standard: We get into the question of
whether to use SRFI-88-style keywords or plain symbols.
I like SRFI-88-style keywords of course (Kawa has them),
but obviously this limits portability.  (Though if we
add them to R7RS-large I'd have less reluctance ...)

If you think the name "XML reader syntax" is misleading,
we could change the name to "Basic XML reader syntax" by
analogy with SRFI-28.

Indeed, that is rather vague - and raises these questions:  (1) What are
"standard Scheme character names"? I suggest going with the R7RS names.

I'm happy with that, of course.

(2) When it comes to an implementation supporting "standard Scheme
character names", is this a "must" or a "should"?  I could go either
way.  (3) Do we want a different answer for SRFI-107 and SRFI-108/-109?
I'd prefer not.

I'd prefer a SHOULD for all three SRFIs.

Updated in: http://per.bothner.com/tmp/srfi-109/srfi-109.html
--
	--Per Bothner
per@xxxxxxxxxxx   http://per.bothner.com/