RE: Syntax Sugar and XML information models
DTDs are missing from the InfoSet. They are probably the most useful
'missing' item.

-Wayne Steele

>From: Michael Champion <mike.champion@s...>
>To: xml-dev <xml-dev@l...>
>Subject: RE: Syntax Sugar and XML information models
>Date: Wed, 28 Mar 2001 21:13:01 -0500
>
> > > Conceptually, perhaps we have:
> > >
> > > The "Syntax Sugar InfoSet" (SSIS) that exposes everything worth
> > > round-tripping in the XML syntax... [even different quote
> > > characters and whitespace???]
> >
> > That list could be endless - you did not even mention attribute order.
>
>Well, that's the nub of the issue here: The W3C InfoSet is widely
>interpreted as decreeing that everything not in the InfoSet is "mere
>syntax sugar". Some of these distinctions are clearly rooted in the XML
>spec and existing practice, such as the fact that the order of attributes
>is insignificant, the type of quotation marks around attribute values is
>insignificant, etc. Others are more controversial, such as CDATA sections.
>[For example, would you really want your XML database to take in XML
>documents with scripts escaped with CDATA sections and return them escaped
>with &lt; etc.?]
>Others really MUST be interpreted differently by authoring tools than the
>InfoSet specifies -- for example, the whole POINT of parsed entities is
>lost if an editor doesn't round-trip them; likewise a database should
>either let its client resolve external entities, or resolve them at
>retrieval time rather than storage time. (Entities are the only thing
>supported in a Recommendation that enables control of redundant
>information ...).
>
>So, there seem to be two classes of things that the InfoSet doesn't cover:
>the "mere syntax" that no reasonable application (except maybe a "diff")
>would care about, and the gray area stuff that some XML tools must care
>about but that the InfoSet says nothing about. My suggestion is to make
>this distinction more formally, based on input from the folks "in the
>trenches" about which details of XML syntax are "significant" and which
>aren't. Maybe there is an endless list of things that some people care
>about and some don't, but I'd at least like to see some discussion before
>giving up.
>
>So, does ANYBODY care about round-tripping a) the specific quote
>characters around attribute values, b) the order of attributes, c)
>character entity references for characters that are in the specified
>character set, d) the two different syntaxes for empty elements, .... ?
>Are there other bits that the InfoSet doesn't represent but have some
>practical significance to real applications? (Let's not discuss
>whitespace ... the complexities there are well-known and too painful to
>think about).
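For readers who want to see the quoted list of "mere syntax" details in action, here is a minimal sketch. It assumes Python's standard-library xml.etree.ElementTree, which is not an InfoSet implementation and is not mentioned in the thread; it merely happens to discard the same details. The two sample documents and the summarize helper are hypothetical, made up for illustration.

```python
import xml.etree.ElementTree as ET

# Two hypothetical serializations that differ only in details the thread
# lists: quote characters, attribute order, a CDATA section versus an
# escaped "<", a character reference, and the two empty-element syntaxes.
doc_a = """<doc a="1" b='2'><s><![CDATA[x < y]]></s><e/><c>&#65;</c></doc>"""
doc_b = """<doc b="2" a="1"><s>x &lt; y</s><e></e><c>A</c></doc>"""

def summarize(elem):
    """Reduce an element to (tag, sorted attributes, text, children).

    Attributes are sorted so that attribute order cannot affect the result.
    """
    return (
        elem.tag,
        sorted(elem.attrib.items()),
        elem.text or "",
        [summarize(child) for child in elem],
    )

tree_a = summarize(ET.fromstring(doc_a))
tree_b = summarize(ET.fromstring(doc_b))

# After parsing, nothing remains that could tell the two documents apart.
same = tree_a == tree_b
print(same)
```

Once parsed, the CDATA section and the escaped form are indistinguishable, which is exactly the round-tripping loss the quoted message worries about for databases and editors.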