Re: CDATA sections in W3C XML Infoset

Bob Kline wrote:

> No? We have quite a bit of code in our XML repository which uses XML
> commands over sockets for its client-server interface to the rest of the
> world. Most of the commands embed an XML document being stored in or
> retrieved from the repository. The embedded documents are wrapped in
> CDATA sections.

And when the embedded document already contains a CDATA section?
Bzzzzt, not well-formed.

> The logic for extracting a document from an incoming
> client command is essentially:
>
> Find the element containing the CDATA section.
> Find the CDATA child of the element.
> Hand the value of the CDATA section to the parser.

I admit this is an easy DOM-based hack. But it shouldn't be *that* much
harder to know what element you are looking for, pull out a Text child
(initially there should be only one, or you can normalize), and do the
conversion below.

> Before you even think about suggesting how easy it would be to restore
> the angle brackets in the embedded document, let me point out that the
> &lt; and &gt; which are not delimiters for the element tags in the
> embedded document cannot be "restored" to < and >, and I submit that it
> is impossible in some cases to distinguish which those were. Therefore
> information has been lost.

Not so if you encode properly. By changing every "&" in the embedded
document to "&amp;" and every "<" to "&lt;" (conceptually in that order),
you get this result:

    Original        Embedding
    <               &lt;
    &               &amp;
    &lt;            &amp;lt;
    &amp;           &amp;amp;
    &amp;lt;        &amp;amp;lt;

Etc. etc. No information is lost: change every "&lt;" to "<" and every
"&amp;" to "&" (conceptually in that order) and the exact original is
restored. In this encoding, ">" characters need not be changed.

> Before you suggest that the embedded document should not have been
> wrapped in a CDATA section in the first place, let me say that:
[points snipped]

These points basically say that your embedded documents are text, not
necessarily XML. The safe way to encode text in an XML document is to
use the mapping above.
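
The mapping above can be sketched in Python as a pair of string
replacements. This is only an illustration, not code from the thread:
the function names encode and decode, the wrapper element name "store",
and the sample documents are all assumptions. The sketch also checks,
using the standard-library ElementTree parser, that a CDATA wrapper
around a document that itself contains a CDATA section fails to parse,
while the escaped form round-trips. One caveat: XML 1.0 additionally
requires ">" to be escaped where it appears in the sequence "]]>" in
character data, so text containing that sequence would need one more
replacement than the two shown here.

```python
import xml.etree.ElementTree as ET


def encode(text: str) -> str:
    # '&' must go first; otherwise the '&' of each new '&lt;' would be escaped again.
    return text.replace("&", "&amp;").replace("<", "&lt;")


def decode(text: str) -> str:
    # Reverse order: '&lt;' first, then '&amp;'.
    return text.replace("&lt;", "<").replace("&amp;", "&")


# The rows of the Original/Embedding table from the post.
table = {
    "<": "&lt;",
    "&": "&amp;",
    "&lt;": "&amp;lt;",
    "&amp;": "&amp;amp;",
    "&amp;lt;": "&amp;amp;lt;",
}
for original, embedding in table.items():
    assert encode(original) == embedding
    assert decode(embedding) == original

# Wrapping in CDATA breaks when the embedded document has its own CDATA
# section: the first "]]>" ends the outer section too early.
inner = "<doc><![CDATA[x < y]]></doc>"
try:
    ET.fromstring("<store><![CDATA[" + inner + "]]></store>")
    nested_ok = True
except ET.ParseError:
    nested_ok = False

# Escaped embedding: the parser undoes the mapping when it reports the
# text content of the (hypothetical) wrapper element.
embedded = '<doc note="a &amp; b">x &lt; y</doc>'
command = "<store>" + encode(embedded) + "</store>"
recovered = ET.fromstring(command).text

assert not nested_ok
assert recovered == embedded
assert decode(encode(embedded)) == embedded
```

In practice the receiving side rarely needs decode at all: reading the
text content through a parser, as in the last step, already yields the
original string.
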

-- 
There is / one art             || John Cowan <jcowan@r...>
no more / no less              || http://www.reutershealth.com
to do / all things             || http://www.ccil.org/~cowan
with art- / lessness           \\ -- Piet Hein