Re: RE: heritage (was Re: SGML on the Web)
10/8/2002 9:31:23 AM, "Michael Kay" <michael.h.kay@n...> wrote:

>have an underlying data model.
>
>The problem is that it should have an underlying model, but it hasn't:
>it only has a "overlying" model (the InfoSet) that is retrofitted to the
>syntax. The fact that the model is retrofitted rather than being a
>normative part of XML means that questions like "are comments
>significant" have never been satisfactorily answered. Even the new
>versions of the specs (XML 1.1 and Namespaces 1.1) do not refer
>normatively to the InfoSet, so these questions remain debateable. And
>the confusion over marginally-significant stuff like CDATA sections,
>namespace prefixes, and inter-element whitespace continues to cause
>interoperability nightmares. If people had defined the model before
>defining the syntax we wouldn't be in this mess.

I completely agree. The DOM's implicit data model tries to be inclusive in exposing "syntax sugar" because it was driven by the requirements of editor vendors, who need to expose that level of control. "Overlying" data model is a good description of the infoset, which is designed more to describe what parsers produce than to prescribe what syntax should be significant. The XPath/XSLT data model was the first to start the job of triage on syntax sugar that should dissolve when parsed, and since its data model is read-only, it doesn't have to worry about round-tripping the way the DOM data model does.

This is a mess, indeed. I think, however, that the reason we are in this mess is that there is a "heritage" in SGML, carried over in SAX, and now in LMNL, that markup really is Just Syntax, and data models are something for the application to define. That's not a problem per se -- obviously lots of people get real work done in that paradigm -- it's just that it doesn't fit into the world of Dynamic HTML scripters, generic XML authoring tools, generic XML transformation languages, generic XML DBMSs, etc.
A DBMS has to take a stand on whether entities are expanded or left unexpanded before indexing; it has to decide whether to preserve CDATA sections and comments, etc.

So, I can agree that "if people had defined the model before delivering the syntax" then WE (the generic data model-oriented subculture) wouldn't be in this mess, but then the "it's just syntax" people wouldn't have come along on the XML parade.
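
To make the "take a stand" point concrete, here is a rough sketch (not anything from the thread itself) that feeds one document to two Python standard-library parsers with different implicit data models. The sample document and the helper names `dom_view` and `etree_view` are made up for illustration. With default settings, `xml.dom.minidom` should keep the comment and the CDATA section as distinct nodes, DOM-style, while `xml.etree.ElementTree` should drop the comment and fold the CDATA section into ordinary text.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET
from xml.dom import Node

# Hypothetical sample document: one comment and one CDATA section.
DOC = "<doc><!-- note --><p><![CDATA[a < b]]></p></doc>"


def dom_view(text):
    """Return the DOM node types found directly under <doc> and under <p>."""
    root = xml.dom.minidom.parseString(text).documentElement
    p = root.getElementsByTagName("p")[0]
    return ([n.nodeType for n in root.childNodes],
            [n.nodeType for n in p.childNodes])


def etree_view(text):
    """Return the child tags of <doc> and the text of <p> as ElementTree sees them."""
    root = ET.fromstring(text)
    return [child.tag for child in root], root.find("p").text


doc_children, p_children = dom_view(DOC)
tags, p_text = etree_view(DOC)

# DOM view: the comment and the CDATA section survive as their own node types.
print("DOM children of <doc>:", doc_children)
print("DOM children of <p>:  ", p_children)

# ElementTree view: only the element and its character data remain.
print("ElementTree children of <doc>:", tags)
print("ElementTree text of <p>:      ", repr(p_text))
```

An application that indexes or round-trips the document gets a different answer to "are comments significant" depending on which of these two views it is built on, which is the choice a generic DBMS cannot leave to the application.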