
Re: Why the Infoset?

  • From: Sean McGrath <sean@d...>
  • To: Elliotte Rusty Harold <elharo@m...>, xml-dev@l...
  • Date: Thu, 03 Aug 2000 19:15:51 +0100

>>This is the sort of "particle physics" I think we need
>>beneath XML 1.
>>
>
[Elliotte Rusty Harold]
>But there is a particle physics beneath the InfoSet that applications 
>can use if they like. It's called the stream. The particles are 
>bytes.  That may seem a little too fundamental to you, and you may 
>want something a little higher level. OK. But all we're doing here is 
>arguing about which layers of abstraction are useful.
>

The W3C infoset work seems to bless two levels
of abstraction:
	a) XML entities are a stream of bytes
	b) XML entities consist of elements, attributes, data ...
	   (all the stuff in the Infoset doc)

I see these two as being at opposite ends of a spectrum.
I see two other interesting foci on that spectrum:


	bytes     tokens            infoset   uber-infoset
	 (a) ------(X)----------------(b)--------(Y)

(a) is comprehensive, but working at this level involves
parsing XML constructs from scratch. This is a lot of
work, as anyone who has ever written an XML parser will
tell you.

(b) is convenient for a broad class of applications but lossy.
Certain stuff is not visible. The stuff that is not visible
is lost if the application round-trips back to XML.

(X) is the space where what SGML called "markup sensitive"
apps live. Apps that care about the difference between
"Hello world" and "&greeting;". Apps that care about
whether an attribute value was defaulted, etc.
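
To make the (X) point concrete, here is a minimal sketch using
Python's standard xml.etree.ElementTree. The <doc> element and
the internal DTD subset are my own illustration, assuming
&greeting; is declared as "Hello world". Once parsed, the two
inputs look the same to the application:

```python
import xml.etree.ElementTree as ET

# Hypothetical inputs: one with literal text, one using an entity
# reference declared in an internal DTD subset.
LITERAL = '<doc>Hello world</doc>'
ENTITY = ('<!DOCTYPE doc [<!ENTITY greeting "Hello world">]>'
          '<doc>&greeting;</doc>')

literal_text = ET.fromstring(LITERAL).text
entity_text = ET.fromstring(ENTITY).text

# The parser expands the entity before the application sees it,
# so the fact that a reference was used is no longer visible.
same_after_parse = literal_text == entity_text
```

A markup-sensitive app would need an interface that reports
the entity reference itself, which this view does not.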

(Y) is the space where high-fidelity round-tripping apps
live. Apps that care about the difference between:
	<name first = "Sean" last = "Mc Grath"/>
and:
	<name
		last = 'Mc Grath'
		first = 'Sean'></name>
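
As a rough sketch of why (Y) needs more than an infoset-style
view, the two forms above can be fed to Python's standard
xml.etree.ElementTree (my choice of parser for illustration
only). Quote style, attribute line breaks and the empty-tag
form do not survive into the parsed result:

```python
import xml.etree.ElementTree as ET

FORM_1 = '<name first = "Sean" last = "Mc Grath"/>'
FORM_2 = """<name
		last = 'Mc Grath'
		first = 'Sean'></name>"""

a = ET.fromstring(FORM_1)
b = ET.fromstring(FORM_2)

# Compare what the application actually gets to see.
same_tag = a.tag == b.tag
same_attrs = a.attrib == b.attrib          # dict comparison ignores order
same_content = (a.text, len(a)) == (b.text, len(b))
```

An app that must write back exactly what it read has nothing
here from which to recover the original quoting or layout.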

(b), which is where the W3C infoset lives, seems to me to
be closest to what SGML called "structure controlled" apps.

I am worried that by blessing a single infoset, the W3C are
leaving big holes in areas (X) and (Y), where a lot of
important XML data processing goes on.

There need to be N infosets (N > 1) to cover
the range of application types people build with
XML.

How that comes to pass remains to be seen. For now,
I would be delighted if the W3C simply *renamed* the
infoset to something more familial, like the "structure
controlled XML infoset", so that it is obvious to readers
where in the spectrum of possible XML infosets it lives.

regards,


http://www.pyxie.org - an Open Source XML Processing library for Python

