Re: Tradeoffs of XML encoding by enclosing all content in CDAT
2008/9/29 Karr, David <david.karr@w...>:
> I pointed out to a client that they're seeing failures parsing XML because
> some of the element content that they're producing contains characters
> illegal in XML content, like "&" (unencoded). They acknowledged that should
> be fixed, but they also said they could instead enclose all content with
> CDATA blocks. That seems bizarre to me, but I'm not sure I can immediately
> come up with all the cogent arguments against that. Can someone summarize
> specifically why you should NOT do that?

You often get this problem when people write XML as a string rather than using a proper XML Writer. For example:

    xmlStr = "<foo>" + someVal + "</foo>";
    write(xmlStr);

There are several problems with this approach, one being that ampersands won't be escaped properly. The answer they usually go for is to replace all occurrences of & with &amp; but then you see double escaping (&amp;amp;) of character and entity references. Then you get the string &amp;amp; in the result, which appears as "&amp;" in the browser, so they attach a post-processing step to convert "&amp;amp;" to &amp; ...and so on and so on. (You also see these pre- and post-processing steps used to get around encoding issues.)

The root cause of all of this is that someone wrote XML as a string rather than using an XML Writer. So I would suggest finding out how they create the XML, and going from there.

-- 
Andrew Welch
http://andrewjwelch.com
Kernow: http://kernowforsaxon.sf.net/
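
To make the difference between string-built XML and writer-built XML concrete, here is a minimal sketch in Python. The message does not say which language or library the client uses, so everything here is an assumption made for illustration: the function names, the sample value, and the use of the standard library's xml.etree.ElementTree as a stand-in for "a proper XML Writer".

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def build_by_concatenation(some_val: str) -> str:
    # Hypothetical helper mirroring the xmlStr example: no escaping at all.
    return "<foo>" + some_val + "</foo>"


def build_with_serializer(some_val: str) -> str:
    # Hypothetical helper: the library escapes markup characters exactly once.
    elem = ET.Element("foo")
    elem.text = some_val
    return ET.tostring(elem, encoding="unicode")


def is_well_formed(xml_str: str) -> bool:
    # True if a conforming parser accepts the string.
    try:
        ET.fromstring(xml_str)
        return True
    except ET.ParseError:
        return False


value = "Fish & Chips <today>"  # sample content, chosen for illustration

naive = build_by_concatenation(value)
safe = build_with_serializer(value)

print(is_well_formed(naive))  # False: the bare "&" is rejected
print(is_well_formed(safe))   # True
print(safe)                   # <foo>Fish &amp; Chips &lt;today&gt;</foo>

# Hand-rolled escaping applied twice reproduces the double-escaping problem.
once = escape("Fish & Chips")
twice = escape(once)
print(twice)                  # Fish &amp;amp; Chips
```

Because the serializer owns the escaping, the value survives a round trip unchanged (parsing `safe` yields the original text back), and no pre- or post-processing step is needed.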