
Re: CDATA sections in W3C XML Infoset

  • From: Bob Kline <bkline@r...>
  • To: XML-Dev Mailing List <xml-dev@l...>
  • Date: Fri, 30 Mar 2001 10:54:59 -0500 (EST)

John Cowan writes:

> Richard Lanyon scripsit:
> 
> > Could someone explain to me why CDATA section start/end markers were
> > taken out of the W3C Infoset?
> 
> Essentially because we (but I am NOT speaking officially for the
> Core WG here) don't think they carry any real information.  There is
> no difference between CDATA sections and careful use of &lt; and
> &amp;.

No?  We have quite a bit of code in our XML repository which uses XML
commands over sockets for its client-server interface to the rest of the
world.  Most of the commands embed an XML document being stored in or
retrieved from the repository.  The embedded documents are wrapped in
CDATA sections.  The logic for extracting a document from an incoming
client command is essentially:

   Find the element containing the CDATA section.
   Find the CDATA child of the element.
   Hand the value of the CDATA section to the parser.
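
For illustration only, the three steps above could be sketched in Python
with the standard library's xml.dom.minidom.  This is not the repository's
actual code: the command format, the element names (StoreDoc, Payload) and
the helper extract_embedded are all hypothetical.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Hypothetical client command; the element names are made up for this sketch.
COMMAND = (
    "<StoreDoc><Payload><![CDATA["
    "<memo><to>staff</to></memo>"
    "]]></Payload></StoreDoc>"
)


def extract_embedded(command_xml, wrapper_tag):
    """Return the character data of the CDATA child of the first <wrapper_tag>."""
    dom = parseString(command_xml)

    # Step 1: find the element containing the CDATA section.
    wrappers = dom.getElementsByTagName(wrapper_tag)
    if not wrappers:
        raise ValueError("no <%s> element in command" % wrapper_tag)

    # Step 2: find the CDATA child of that element.
    for child in wrappers[0].childNodes:
        if child.nodeType == Node.CDATA_SECTION_NODE:
            return child.data

    raise ValueError("<%s> has no CDATA section child" % wrapper_tag)


embedded = extract_embedded(COMMAND, "Payload")

# Step 3: hand the value of the CDATA section to the parser.
document = parseString(embedded)
print(document.documentElement.tagName)
```

The lookup in step 2 depends on the node type, not on the character data,
so it finds nothing when the same content arrives as an ordinary text node.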

That doesn't work if some process in the pipeline replaces the CDATA
section with an escaped text node.  You may be silently adding some
mental qualifications to your phrase "no difference" but from where I
stand if it breaks software that we've written to a W3C interface (the
DOM) there's a difference, any amount of sophistry notwithstanding.  I
see that there are tools out there already (e.g., Lars Marius Garshol's
roundtrip.py [1]) which make the same assumption you are making.  This
is lovely news.
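
A small sketch of the disagreement, again using Python's xml.dom.minidom
and a made-up Payload element: the two serializations below yield the
same character data, but the DOM reports different node types for it, so
code that looks for a CDATA section node sees only one of them.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# The same hypothetical payload, once in a CDATA section and once escaped.
WITH_CDATA = "<Payload><![CDATA[<memo/>]]></Payload>"
ESCAPED = "<Payload>&lt;memo/&gt;</Payload>"


def describe(xml_text):
    """Return (child node types, concatenated character data) of the root."""
    root = parseString(xml_text).documentElement
    node_types = [child.nodeType for child in root.childNodes]
    text = "".join(child.data for child in root.childNodes)
    return node_types, text


cdata_types, cdata_text = describe(WITH_CDATA)
escaped_types, escaped_text = describe(ESCAPED)

# Same characters either way ...
print(cdata_text == escaped_text)
# ... but only the first form contains a CDATA section node.
print(Node.CDATA_SECTION_NODE in cdata_types)
print(Node.CDATA_SECTION_NODE in escaped_types)
```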

Before you even think about suggesting how easy it would be to restore
the angle brackets in the embedded document, let me point out that any
&lt; and &gt; which were already escaped in the embedded document (that
is, which are not delimiters for its element tags) must not be
"restored" to < and >, and I submit that in some cases it is impossible
to tell which ones those were.  Therefore information has been lost.

Before you suggest that the embedded document should not have been
wrapped in a CDATA section in the first place, let me say that:

  1. not doing so would make it impossible to validate the commands
     and their responses against their DTDs; and
  2. we have a requirement to be able to store a document in progress
     even in the case in which it is not well-formed (some of the
     documents in the repository will be imported from outside sources
     and we must capture the original documents whatever their state).

[1] http://cvs.sourceforge.net/cgi-bin/cvsweb.cgi/python/dist/src/Demo/xml/roundtrip.py?cvsroot=python

-- 
Bob Kline
mailto:bkline@r...
http://www.rksystems.com


