
RE: XML Binary Characterization WG public list available




> -----Original Message-----
> From: Elliotte Rusty Harold [mailto:elharo@m...] 
> Sent: Monday, April 12, 2004 19:06
> To: bob@w...
> Cc: 'Michael Champion'; ''xml-dev' DEV'
> Subject: RE: XML Binary Characterization WG public list available
> 
> 
> At 4:59 PM -0400 4/12/04, Bob Wyman wrote:
> >Elliot Rusty Harold wrote:
> >>  DOM and SAX at least can handle documents that do
> >>  not have infosets, and I think XPath/XSLT can too
> >	What does it mean to say that an XML document "does not have an 
> >infoset"? I'm a bit puzzled by the words here. I thought that any XML 
> >document could be described via an Infoset. What am I missing? (My 
> >apologies if this is a silly question or if I've missed something very 
> >fundamental in the XML specs...)
> 
> There are at least two such cases, as John Cowan 
> pointed out before:
> 
> 1. Documents that are namespace malformed
> 2. Namespace well-formed documents that use relative namespace URIs
> 
> I've encountered both of these in practice. There may be other cases, 
> but these (or more specifically case 1) were what I was thinking of 
> here.
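
Case 1 can be illustrated with a short sketch. It is not from the original message: it assumes Python's standard xml.parsers.expat module, and the helper name parses is hypothetical. The same bytes are fed to a parser that ignores namespaces and to one that processes them.

```python
import xml.parsers.expat

# Well-formed under XML 1.0, but the prefix "a" is never declared,
# so the document is namespace malformed.
DOC = b"<a:b/>"


def parses(doc, namespace_aware):
    """Return True if expat accepts doc, False if it reports an error."""
    # A non-None separator switches on expat's namespace processing.
    separator = " " if namespace_aware else None
    parser = xml.parsers.expat.ParserCreate(namespace_separator=separator)
    try:
        parser.Parse(doc, True)
        return True
    except xml.parsers.expat.ExpatError:
        return False


print(parses(DOC, namespace_aware=False))
print(parses(DOC, namespace_aware=True))
```

With the expat bundled in CPython, the namespace-unaware parse should succeed and the namespace-aware parse should fail on the unbound prefix. A namespace-unaware API can still process such a document even though no infoset is defined for it.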
> 
> The problem in the reverse direction is much larger though. There are 
> many, many infosets that do not correspond to well-formed XML 1.0 
> documents. This has been a major problem for various specifications 
> based on the Infoset including XInclude. By starting with the 
> infoset, rather than XML itself, technologies tend to lose track of 
> some critical rules like "element names may not contain white space" 
> or "an element may not have two attributes with the same name." These 
> rules are not enforced in the infoset, only in real XML.
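
The reverse problem can be seen with any in-memory tree API that does not check names. The sketch below is an illustration only, not part of the original message. It uses Python's xml.etree.ElementTree as a stand-in for an infoset-like structure (it is not an Infoset implementation), and the helper is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET

# The tree API accepts an element name containing white space,
# which no XML 1.0 document could ever produce.
element = ET.Element("bad name")
serialized = ET.tostring(element)


def is_well_formed(data):
    """Return True if data parses as XML, False otherwise."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


print(serialized)
print(is_well_formed(serialized))
```

The tree is built and serialized without complaint, but the bytes it produces cannot be parsed back. The rule was only ever enforced by the XML parser, not by the data structure.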


The view taken in developing the "fast infoset" standard in ISO/ITU-T is to
consider the subset of infosets that have an XML representation and the
subset of XML documents that have an infoset.  Further simplifications are
made, so the actual subset is smaller still.  The objective is to
cover a very large number of infosets / XML documents that are of practical
interest, not to cover all possible XML documents and all possible infosets.

(Of course, some people will have a different opinion on what XML documents
are of practical interest.)

By doing so, we will have an alternative representation of an XML document,
that is both more compact and faster to parse and create, but that can be
easily converted back and forth from/to XML.  For all the XML documents that
are in the "subset" mentioned above, conversion is lossless, character-wise,
except for details such as whitespace inside tags and the choice of quotes or
apostrophes as attribute value delimiters.
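
To make the exceptions concrete, here is a minimal sketch. It assumes Python's xml.etree.ElementTree as a rough stand-in for an infoset-level view; it is not the fast infoset encoding, and the helper name shape is hypothetical. Two documents that differ only in whitespace inside tags and in attribute quoting reduce to the same tree.

```python
import xml.etree.ElementTree as ET


def shape(element):
    """Reduce an element to a comparable tuple of its parsed content."""
    return (
        element.tag,
        sorted(element.attrib.items()),
        element.text,
        element.tail,
        [shape(child) for child in element],
    )


# Same content; only whitespace inside tags and the quote style differ.
DOC_A = '<a x="1" ><b/></a >'
DOC_B = "<a x='1'><b/></a>"

print(DOC_A == DOC_B)
print(shape(ET.fromstring(DOC_A)) == shape(ET.fromstring(DOC_B)))
```

The two strings differ as text, yet the parsed trees compare equal. A representation defined at that level cannot reproduce those lexical details on the way back to XML.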

By the way, we are not calling this thing "XML something" or "binary XML".
We are calling it "fast infoset".

The specification is written in terms of the infoset, by providing an ASN.1
definition for each information item and item property, with some
simplifications (as mentioned above).

I hope you will agree that such a thing can be useful to many people,
although it is not XML, of course.

Alessandro Triglia
OSS Nokalva


> 
> The mapping between infosets and XML documents is neither 1-1 nor 
> onto, even when lexical issues like white space inside tags are 
> ignored.
> 
> Remember, despite what people keep saying, the infoset is *NOT* a data 
> model for XML. It is *NOT* a replacement for the XPath data model, 
> the DOM data model, or any other data model.
> 
> -- 
> 
>    Elliotte Rusty Harold
>    elharo@m...
>    Effective XML (Addison-Wesley, 2003)
>    http://www.cafeconleche.org/books/effectivexml
>    
> http://www.amazon.com/exec/obidos/ISBN%3D0321150406/ref%3Dnosim/cafeaulaitA

-----------------------------------------------------------------
The xml-dev list is sponsored by XML.org <http://www.xml.org>, an initiative
of OASIS <http://www.oasis-open.org>

The list archives are at http://lists.xml.org/archives/xml-dev/

To subscribe or unsubscribe from this list use the subscription
manager: <http://www.oasis-open.org/mlmanage/index.php>


