
Re: Why the Infoset?

  • From: Sean McGrath <sean@d...>
  • To: xml-dev@x...
  • Date: Tue, 01 Aug 2000 15:44:20 +0100

>John Cowan wrote:
>>  Character references are lost, it is true.
>> If you want them back, shout now.
>
At 21:56 01/08/00 +0800, Rick JELLIFFE wrote:
>Can I shout the opposite: "the fact that a character was entered
>directly or by reference should not be information available for any
>other specification or general-purpose application: it should not be
>part of the infoset."
>
>This is because the use of character references should be determined by
>its availability in the encoding used (and any user-supplied "kernel"
>encoding within that).  XML should be defined using Unicode characters,
>not the markup that achieved the character.
>
Can I shout the opposite of this opposite?

This is a good case in point where the in/not-in dualism of the
OTI (One True Infoset) approach falls down. If character references
are not in the infoset, then it is impossible to write an
XML-parser-based application that processes them, because the parser
never reports them.
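
To make that concrete, here is a minimal sketch. It assumes Python 3
and uses the standard library's xml.etree.ElementTree purely as a
stand-in for a parser-based API (it is not Pyxie, and the sample
documents are made up for illustration). The same character is entered
once directly and once by reference:

```python
import xml.etree.ElementTree as ET

# The same character (U+00E9), entered directly and by character reference.
direct = ET.fromstring("<p>caf\u00e9</p>")
by_ref = ET.fromstring("<p>caf&#233;</p>")

# The parser resolves the reference before the application sees anything,
# so both documents deliver identical character data.
same = direct.text == by_ref.text
print(same)
```

By the time the application gets the text, nothing remains to say which
form the author typed.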

The only way to process them would be to do so *lexically*.
In shifting to a lexically based algorithm you would basically
need to *re-write* an XML parser in order to be sure that you
were identifying character references correctly every time.

Oh, sure, you can write a regexp that will work "most of the time", but
try telling that to the client of the m-commerce/healthcare/rocket-launching
XML application you are building.
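
A rough sketch of the "most of the time" problem, assuming Python 3.
The pattern and the sample document are my own illustration, not taken
from any real application. The regexp finds anything that looks like a
character reference, including text inside a comment and a CDATA
section, where XML says it is not a reference at all:

```python
import re
import xml.etree.ElementTree as ET

# Naive pattern for decimal and hexadecimal character references.
CHAR_REF = re.compile(r"&#(?:[0-9]+|x[0-9A-Fa-f]+);")

doc = (
    "<doc>"
    "<!-- &#65; inside a comment is not a reference -->"
    "<![CDATA[&#66; inside CDATA is literal text]]>"
    "<p>&#67; is a real character reference</p>"
    "</doc>"
)

# The regexp reports three hits, but only the one inside <p> is a
# character reference as far as an XML parser is concerned.
matches = CHAR_REF.findall(doc)
print(matches)

# A real parser resolves only the genuine reference (&#67; is "C").
resolved = ET.fromstring(doc).find("p").text
print(resolved)
```

Teaching the regexp about comments, CDATA sections, processing
instructions and the DTD is exactly the "re-write a parser" job
described above.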

regards,
Sean

http://www.pyxie.org - an Open Source XML Processing library for Python

