Infosets a Horror Story? (was Re: Article: "The horror of XML")
10/31/2002 7:36:04 AM, Elliotte Rusty Harold <elharo@m...> wrote:

A good topic for Samhain (aka Halloween), when the boundary between the worlds -- of life and death, syntax and data models -- becomes blurred :-)

>It's the infoset's fault that it doesn't mandate simple
>well-formedness. I have no objection to synthetic infosets or
>non-text, internal representations of the Infoset such as a DOM Document
>object. I object when those representations do not adhere to the same
>basic rules XML 1.0 does.

Uhh, "This specification defines an abstract data set called the XML Information Set (Infoset). Its purpose is to provide a consistent set of definitions for use in other specifications that need to refer to the information in a well-formed XML document."

I fully agree that it would be nice for Someone (I despair of this being the W3C) to formally describe the implicit data model in XML. The trouble is, some people deny that it exists, and most people who try to take a stab at this hit quicksand quickly. ("Can there be adjacent Text nodes? What about CData sections, unexpanded entity references, and other syntax sugar?") Then there are Namespaces, whose Giant [expletive deleted] Sound scares away all but the bravest explorers of this space. Then there's the "PSVI" stuff (even XML 1.0 constructs such as attribute types and default attribute values arguably are part of the PSVI). Not to mention XInclude and the lack of a common processing model saying when it is applied!

Since everyone who looks at this comes up with a different answer, there's no answer that will satisfy everyone (one reason the Infoset spec is so, uhh, non-directive, I believe).

I personally (taking all my hats off!!) think that a single data model ought to be described and what we call "XML" redefined on top of that single data model. Syntax sugar is fine, but it probably ought to be resolved in a pre-parser akin to the C preprocessor that produces a canonical syntax that could be the basis for true interoperability at the syntax level. Parsers (of this canonical syntax or of any number of "little languages" and alternate syntaxes) that produce data structures that logically conform to the single data model could be considered to be "XML", and processed with XSLT, queried with XQuery, passed around via SOAP, etc.

All this is not going to happen until the cruft overwhelms us, and so far people have dealt with the cruft by ad hoc profiles (e.g., the one SOAP uses) and implicit agreement on what the specs really mean. We shall see if that suffices in the long run.
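To make the "pre-parser that produces a canonical syntax" idea concrete, here is a minimal sketch. It uses the C14N 2.0 canonicalizer from Python's standard library (`xml.etree.ElementTree.canonicalize`, Python 3.8 or later) purely as a stand-in for such a pre-parser; this is an assumption for illustration, not the canonical syntax the post has in mind. The two sample documents and the variable names are made up.

```python
from xml.etree.ElementTree import canonicalize  # available in Python 3.8+

# Two hypothetical documents that differ only in "syntax sugar":
# a CDATA section vs. an escaped "<", attribute order, quote style,
# and an empty-element tag vs. a start/end tag pair.
with_sugar = '<doc b="2" a="1"><![CDATA[x < y]]><empty/></doc>'
plain = "<doc a='1' b='2'>x &lt; y<empty></empty></doc>"

# Canonicalization acts like the imagined pre-parser: it resolves the
# sugar and emits one agreed-upon serialization for the same content.
canonical_a = canonicalize(with_sugar)
canonical_b = canonicalize(plain)

print(canonical_a)
print(canonical_a == canonical_b)
```

Under these assumptions the two inputs canonicalize to the same string, so a downstream tool would only ever have to agree on one syntax. Note that this sketch sidesteps the hard cases listed above (unexpanded entity references, PSVI contributions, XInclude), which are exactly where agreement on a data model breaks down.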