  • From: "Michael Kay" <mike@s...>
  • To: "'Karr, David'" <david.karr@w...>,<xml-dev@l...>
  • Date: Mon, 29 Sep 2008 16:22:34 +0100

Title: Tradeoffs of XML encoding by enclosing all content in CDATA blocks
The best argument is that people who adopt this approach usually fail to check for the presence of "]]>", which isn't allowed in CDATA sections. Once you start checking for that and dealing with it properly, it turns out to be easier to check for & and < and escape them as &amp; and &lt; respectively.
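
To illustrate the point about "]]>", here is a minimal Python sketch comparing the two approaches. The function names (naive_cdata, safe_cdata, escape_text) are hypothetical and not from the original message. The sketch also escapes > in the plain-escaping variant, since the literal string "]]>" is not permitted in ordinary element content either.

```python
import xml.etree.ElementTree as ET


def naive_cdata(text: str) -> str:
    # What people usually write: no check for "]]>" inside the text.
    return "<![CDATA[" + text + "]]>"


def safe_cdata(text: str) -> str:
    # "]]>" cannot appear inside a CDATA section, so split it across
    # two adjacent sections: "]]" ends the first, ">" starts the second.
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def escape_text(text: str) -> str:
    # Replace & first so the & of the generated entities is not re-escaped.
    return (
        text.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
    )


sample = "a ]]> b & c < d"

# The naive wrapper yields a document that is not well-formed.
try:
    ET.fromstring("<e>" + naive_cdata(sample) + "</e>")
    naive_ok = True
except ET.ParseError:
    naive_ok = False

# Both careful variants round-trip the original text.
from_cdata = ET.fromstring("<e>" + safe_cdata(sample) + "</e>").text
from_escape = ET.fromstring("<e>" + escape_text(sample) + "</e>").text
```

Once the "]]>" case is handled, the CDATA version is no shorter or simpler than the three replace calls used for escaping.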
 
Also, the code for escaping & and < works for both elements and attributes (though attributes also need some attention to look for quotes), whereas the CDATA approach only works for elements.
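
As a rough sketch of the same idea using Python's standard library: xml.sax.saxutils.escape handles element content, and quoteattr handles attribute values, including the choice and escaping of quotes. The build_element helper, the "item" element and the "note" attribute are made up for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def build_element(name: str, attr_value: str, text: str) -> str:
    # quoteattr returns the value already wrapped in suitable quotes,
    # with &, < and > escaped; escape does the same for element content.
    return f"<{name} note={quoteattr(attr_value)}>{escape(text)}</{name}>"


value = 'Tom & "Jerry" < friends'
doc = build_element("item", value, value)

# The same string survives a round trip in both positions.
parsed = ET.fromstring(doc)
text_back = parsed.text
attr_back = parsed.get("note")

# A CDATA section is not recognised inside an attribute value; the
# leading "<" makes the document not well-formed.
try:
    ET.fromstring('<item note="<![CDATA[Tom & Jerry]]>"/>')
    cdata_attr_ok = True
except ET.ParseError:
    cdata_attr_ok = False
```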
 
Michael Kay
http://www.saxonica.com/


From: Karr, David [mailto:david.karr@w...]
Sent: 29 September 2008 16:03
To: xml-dev@l...
Subject: Tradeoffs of XML encoding by enclosing all content in CDATA blocks

I pointed out to a client that they're seeing failures parsing XML because some of the element content that they're producing contains characters illegal in XML content, like "&" (unencoded).  They acknowledged that should be fixed, but they also said they could instead enclose all content with CDATA blocks.  That seems bizarre to me, but I'm not sure I can immediately come up with all the cogent arguments against that.  Can someone summarize specifically why you should NOT do that?


