SAX needs from our point of view
Michael Amster writes:

 > In our case, having embedded XML languages with our own language
 > controlling flow of execution, we have a real need for an accurate
 > reproduction of the XML elements parsed so they can be rewritten
 > correctly.

SAX reports all elements, together with character data, ignorable
whitespace, and processing instructions, so you won't lose anything
there.

 > Specifically, the issue is important in distinguishing between text and
 > CDATA. Let me illustrate with a simple example:
 >
 >   <WEIF COND="true">
 >     <WETHEN>
 >       <ARBITRARYXML/>
 >       <![CDATA[
 >       This is data with &references; which should not be parsed!
 >       ]]>
 >       <MOREXML>
 >         This is just text
 >       </MOREXML>
 >     </WETHEN>
 >   </WEIF>
 >
 > When this is reported up from a SAX parser, we do not differentiate between
 > text and the CDATA, but let's say that we want to output the subset of
 > arbitrary XML back out from our DOM or other object structure:
 >
 >   <ARBITRARYXML/>
 >   This is data with &references; which should not be parsed!
 >   <MOREXML>
 >     This is just text
 >   </MOREXML>

Your output routine is wrong: it should automatically escape all
instances of '&', '<', and '>':

  <ARBITRARYXML/>
  This is data with &amp;references; which should not be parsed!
  <MOREXML>
    This is just text
  </MOREXML>

or even

  <ARBITRARYXML/>
  This is data with &#38;references; which should not be parsed!
  <MOREXML>
    This is just text
  </MOREXML>

 > Now you see that the CDATA will have all references made when it is
 > reparsed. We really do want to preserve CDATA as different from
 > text in SAX.
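As a rough sketch of the escaping output routine described above, the following uses Python's standard `xml.sax` module rather than any particular SAX level 1 parser (that choice, and the names `TextCollector` and `parsed_text`, are assumptions made for illustration only). It reads character data out of a CDATA section, escapes it on output, and checks that reparsing gives back the same text:

```python
import xml.sax
from xml.sax.saxutils import escape


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler: gathers every chunk of character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        # CDATA content and ordinary text both arrive through this callback.
        self.chunks.append(content)


def parsed_text(document: str) -> str:
    """Parse a document and return its character data as one string."""
    handler = TextCollector()
    xml.sax.parseString(document.encode("utf-8"), handler)
    return "".join(handler.chunks)


original = "<doc><![CDATA[data with &references; and 1 < 2]]></doc>"
text = parsed_text(original)

# The output routine escapes '&', '<', and '>' instead of copying raw text.
rewritten = "<doc>" + escape(text) + "</doc>"

# Reparsing the escaped output yields the same character data.
roundtrip = parsed_text(rewritten)

print(rewritten)
print(roundtrip == text)
```

With escaping done at output time, the rewritten document no longer contains a CDATA section, yet a parser reading it sees the same characters as before.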
If there's a semantic attached to your use of CDATA, you should
represent it with an element (which is guaranteed to make it through
processing):

  <listing><![CDATA[
  Here is a listing: 1 < 2
  ]]></listing>

  <listing>
  Here is a listing: 1 &lt; 2
  </listing>

There is no need for general XML processing tools _ever_ to know about
CDATA sections; authoring and repository tools (including tools for
authoring transforms) might want to preserve them, but those fall
outside the target audience for SAX level 1.

Think of the analogy of C: the preprocessor takes care of surface
things like macros and hides them from the compiler, which produces
exactly the same object code for

  #define FOO 1
  printf("%d", FOO + FOO);

and

  printf("%d", 1 + 1);

All the best, and thanks for the comments,

David

-- 
David Megginson                 ak117@f...
Microstar Software Ltd.         dmeggins@m...
      http://home.sprynet.com/sprynet/dmeggins/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)