XML-DEV Mailing List Archive

Re: XML's Scylla and Charybdis - parse and regexp


> dareo@m... (Dare Obasanjo) writes:
> >In fact, I'd go as far as to say that applications that 
> >depend on full lexical round tripping (i.e. preserving all whitespace, 
> >treating attribute order as significant, etc) are in violation of the 
> >spirit if not the letter of the XML 1.0 recommendation. 
> 
> Given that you have divined a data model for XML that isn't actually
> specified as the XML data model anywhere, I'm pretty wary of your
> interpretations of the spirit of XML 1.0.
> 
> The XML 1.0 recommendation does say: "Note that the order of attribute
> specifications in a start-tag or empty-element tag is not significant."
> It does not say "parsers must discard the order and scramble the
> attributes."

Hear, hear.


> >However I agree there are edge cases such as XML editors where such 
> >behavior is not just desirable but required. 
> 
> There seems to have been a movement early on, especially with the DOM,
> to chop out "editor-only" functionality.  I'm not sure that was such a
> brilliant move in retrospect.

Ha.  Who needs retrospect?  As I recall, the danger here was clearly discussed 
even while DOM Level 1 was in progress.  The argument that allowed these 
issues to be set aside was always "we'll deal with it in another DOM level".  
Of course that never happened.

This whole thread has been a sort of Möbius strip for me.  I find the idea of 
regexen for general and primary XML processing appalling, yet I agree with many 
regexen boosters that using XML APIs such as SAX and DOM too often mangles XML 
documents at the lexical level.  I find the idea of "The XML Data Model"(TM) a 
laughable fiction, and yet I feel it is very important for the most common tool 
sets to operate on XML models such as SAX and XPath, which do omit lexical 
fidelity.
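
To make the lexical-level mangling concrete, here is a minimal sketch in 
Python.  It uses the standard library's xml.etree.ElementTree purely as a 
stand-in for any tree-building API; it is not SAX or DOM itself, and the 
sample document and the round_trip helper are made up for illustration.  
The exact serialized form can vary between Python versions, so the sketch 
only compares the result against the source rather than assuming a 
particular output.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: extra whitespace inside the start-tag, mixed quote
# styles, and a comment. All of these are lexical details.
SOURCE = "<doc  b='2'   a=\"1\"><!-- note --><item/></doc>"


def round_trip(text):
    """Parse the text into a tree, then serialize the tree back to a string."""
    return ET.tostring(ET.fromstring(text), encoding="unicode")


result = round_trip(SOURCE)

# The attribute names and values survive the trip through the API...
same_attributes = ET.fromstring(result).attrib == ET.fromstring(SOURCE).attrib

# ...but the character-for-character form of the document does not.
lexically_identical = result == SOURCE

print(result)
print("same attributes:", same_attributes)
print("lexically identical:", lexically_identical)
```

The parsed attributes compare equal before and after, while the serialized 
text differs from the source: the default parser discards the comment, and 
the serializer picks its own quoting and spacing.  That gap is what an 
editor-style tool would have to close by some other means.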

I must say that this thread does give me very interesting ideas about 
next-generation XML processing models in Python, so I guess I'm grateful.


-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com
Use internal references in XML vocabularies - http://www-106.ibm.com/developerworks/xml/library/x-tipvocab.html
Universal Business Language (UBL) - http://www-106.ibm.com/developerworks/xml/library/x-think16.html
EXSLT by example - http://www-106.ibm.com/developerworks/library/x-exslt.html
The worry about program wizards - http://www.adtmag.com/article.asp?id=7238
Use rdf:about and rdf:ID effectively in RDF/XML - http://www-106.ibm.com/developerworks/xml/library/x-tiprdfai.html
Keep context straight in XSLT - http://www-106.ibm.com/developerworks/xml/library/x-tipcurrent.html
Using SAX for Proper XML Output - http://www.xml.com/pub/a/2003/03/12/py-xml.html
SAX filters for flexible processing - http://www-106.ibm.com/developerworks/xml/library/x-tipsaxflex.html


