
Infosets a Horror Story? (was Re: Article: "The horror of XML")

10/31/2002 7:36:04 AM, Elliotte Rusty Harold <elharo@m...> wrote:

A good topic for Samhain (aka Halloween), when the boundary between
the worlds (of life and death, of syntax and data models) becomes blurred :-)

>It's the infoset's fault that it doesn't mandate simple
>well-formedness. I have no objection to synthetic infosets or
>non-text, internal representations of the Infoset such as a DOM Document
>object. I object when those representations do not adhere to the same
>basic rules XML 1.0 does.

Uhh, "This specification defines an abstract data set called the 
XML Information Set (Infoset). Its purpose is to provide a 
consistent set of definitions for use in other specifications
that need to refer to the information in a well-formed XML document."

I fully agree that it would be nice for Someone (I despair of this
being the W3C) to formally describe the implicit data model in
XML.  The trouble is, some people deny that it exists, and most people who
try to take a stab at this hit quicksand quickly.  ("Can there
be adjacent Text nodes?  What about CDATA sections, unexpanded
entity references, and other syntax sugar?")  Then there are Namespaces,
whose Giant [expletive deleted] Sound scares away all but the bravest
explorers of this space. Then there's the "PSVI" stuff (even XML 1.0
constructs such as attribute types and default attribute values
arguably are part of the PSVI).  Not to mention XInclude and the
lack of a common processing model saying when it is applied!
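
To make the adjacent-Text-nodes question concrete, here is a minimal sketch using Python's standard xml.dom.minidom (my choice of illustration, not something the specs prescribe). It builds two in-memory trees that serialize to the same characters but disagree on how many Text nodes they contain, until one of them is normalized. The variable names are just for the example.

```python
from xml.dom import minidom

# Tree 1: produced by parsing; the parser hands back a single Text node.
parsed = minidom.parseString("<a>onetwo</a>").documentElement

# Tree 2: built through the DOM API with two adjacent Text nodes.
built_doc = minidom.Document()
built = built_doc.createElement("a")
built_doc.appendChild(built)
built.appendChild(built_doc.createTextNode("one"))
built.appendChild(built_doc.createTextNode("two"))

# Same syntax on the wire...
same_syntax = parsed.toxml() == built.toxml()

# ...but a different node structure in memory.
counts_before = (len(parsed.childNodes), len(built.childNodes))

# normalize() merges adjacent Text nodes, hiding the difference again.
built.normalize()
counts_after = (len(parsed.childNodes), len(built.childNodes))

print(same_syntax, counts_before, counts_after)
```

Whether those two trees count as "the same document" is exactly the kind of question a formal data model would have to answer one way or the other.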

Since everyone who looks at this comes up with a different answer,
there's no answer that will satisfy everyone (one reason the Infoset 
spec is so, uhh, non-directive, I believe).

I personally (taking all my hats off!!) think that a single data model ought
to be described and what we call "XML" redefined on top of that single
data model.  Syntax sugar is fine, but it probably ought to be resolved in
a pre-parser akin to the C preprocessor that produces a canonical 
syntax that could be the basis for true interoperability at the syntax
level.  Parsers (of this canonical syntax or of any number of "little languages"
and alternate syntaxes) that produce data structures that logically conform to the
single data model could be considered to be "XML", and processed with
XSLT, queried with XQuery, passed around via SOAP, etc.
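
As a rough illustration of the "resolve the sugar first" idea, the sketch below uses the canonicalize() function in Python's xml.etree.ElementTree (available from Python 3.8, implementing Canonical XML 2.0). That is an existing canonicalization, not the pre-parser proposed above, and the two sample documents are made up for the example. They differ in attribute order, quote style, CDATA usage, and empty-element syntax, yet reduce to one canonical string.

```python
from xml.etree.ElementTree import canonicalize

# Two spellings of what most people would call "the same" document.
sugared = "<doc b=\"2\" a='1'><![CDATA[x < y]]><e/></doc>"
plain = '<doc a="1" b="2">x &lt; y<e></e></doc>'

# With no output stream given, canonicalize() returns the canonical text.
canon_sugared = canonicalize(sugared)
canon_plain = canonicalize(plain)

print(sugared != plain)              # the raw syntax differs
print(canon_sugared == canon_plain)  # the canonical forms agree
```

Canonical XML settles only some of the questions raised earlier (it says nothing about a PSVI or when XInclude is applied), so treat this as a hint of what a canonical syntax buys, not as the single data model itself.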

All this is not going to happen until the cruft overwhelms us, and
so far people have dealt with the cruft by ad hoc profiles (e.g., the
one SOAP uses) and implicit agreement on what the specs really mean.  
We shall see if that suffices in the long run.  

