Re: Why the Infoset?
----- Original Message -----
From: "Paul W. Abrahams" <abrahams@v...>
To: "XMLDev list" <xml-dev@l...>
Sent: Friday, July 28, 2000 8:16 PM
Subject: Re: Why the Infoset?

> Viewed as an elegant description of the information contained in an XML
> document, the Infoset make sense. But unlike the other XML specs, its
> normative effect is unclear. If I'm implementing an XML-related processor of
> any variety, what does the Infoset require me to do that I would not have to do
> if the Infoset never existed?

It answers questions that are irrelevant when XML is viewed as a syntax, but quite important to users of the DOM, XPath, XSL, etc. that operate on some representation of a more abstract parsed XML document.

For example, the XML spec says that "<empty></empty>" and "<empty/>" are both well formed XML elements, but nothing about whether they are equivalent. Infoset says (or at least the previous draft did) that they are.

Likewise, as was pointed out earlier, InfoSet says that certain well-formed XML elements such as "<ns::foo>blah</ns::foo>" do NOT have an unambiguous internal representation. Without the InfoSet, it would be unclear if this is an element named "foo" with a namespace prefix "ns", an element "foo" with a prefix "ns:", or an element named "ns::foo". [OK, so shoot me if I've got a detail wrong here ... I'm trying to illustrate the general point ;~) ]

The lack of an InfoSet certainly made it much harder to invent the Level 1 DOM; it simply was not clear (and was highly contentious) whether expanded entity references remained in the XML document tree or not... and how mixed content would be represented in the tree. Once the DOM and XPath were invented, subtle differences emerged in their conceptions of what an abstract representation of an XML document looks like ... and there's always the "groves" model that underlies the HyTime and DSSSL specs that provides yet another perspective on what an abstract SGML document looks like.

While I personally fear that InfoSet [again, previous drafts anyway] papers over these differences rather than clearly specifying a single model, it definitely provides a much clearer notion of what an XML parser produces, and what an XML API or transformer operates on, than would exist in its absence.

So, one fairly practical normative question it *does* answer would be: 'My application would like to treat "<empty></empty>" as signifying "data with the value NULL" and "<empty/>" as signifying "no data". Can I do this in an environment where the XML will be processed by various tools that implement the XML specs but that I do not control?' The answer, for better or worse, is NO - an XML processor is under no obligation to preserve this distinction. That answer comes from the InfoSet ... not the XML spec, the DOM, XSLT, etc.
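
To make the "<empty></empty>" versus "<empty/>" point concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The choice of parser and the helper name summarize are my own assumptions for illustration; other parsers and APIs may expose more or less detail, but this one shows how a common tool hands back the same tree for both spellings.

```python
import xml.etree.ElementTree as ET


def summarize(xml_text):
    """Return what the parsed tree exposes for the root element.

    summarize is a hypothetical helper for this illustration only.
    """
    elem = ET.fromstring(xml_text)
    # tag name, attributes, character content, number of child elements
    return (elem.tag, dict(elem.attrib), elem.text, len(elem))


start_end_pair = summarize("<empty></empty>")
empty_tag = summarize("<empty/>")

# Both spellings yield the same in-memory description.
print(start_end_pair == empty_tag)

# Re-serializing erases whatever spelling the author originally used.
out_a = ET.tostring(ET.fromstring("<empty></empty>"))
out_b = ET.tostring(ET.fromstring("<empty/>"))
print(out_a == out_b)
```

An application that wants to distinguish "NULL" from "no data" therefore has to carry that distinction in something the tools do preserve, such as an attribute or the presence or absence of the element itself.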