CDATA by any other name... (was The raw and the cooked)

From: <david@m...>
To: XML Dev <xml-dev@i...>
Date: Fri, 30 Oct 1998 10:36:00 -0500 (EST)

So, Henry's asking whether this is valid:

  <!DOCTYPE a [
    <!ELEMENT a (b, c)>
    <!ELEMENT b EMPTY>
    <!ELEMENT c EMPTY>
  ]>
  <a><![CDATA[  ]]><b/><c/></a>

I'd like to hear Tim Bray's opinion, unless I've missed it already in
this thread (are you reading this, Tim, or alternatively, do you have
an e-mail filter that looks for your name?).

My hunch is that this example *is* valid -- after all, the "<![CDATA["
just means "change to special delimiter-recognition mode" and "]]>"
means "go back to the regular delimiter-recognition mode": neither
implies anything about the contents.
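
As a rough illustration of that reading, here is a small Python sketch
using the standard xml.parsers.expat module.  It is only an assumption
about how one particular non-validating parser reports things, so it
says nothing about validity itself; the helper name parse_events is
made up for this example.  It records the events reported for Henry's
document so that the CDATA delimiters and the characters between them
can be seen separately.

```python
import xml.parsers.expat


def parse_events(document):
    """Return the list of events expat reports for an XML string."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True  # coalesce adjacent character data into one event
    parser.StartElementHandler = lambda name, attrs: events.append(("start", name))
    parser.EndElementHandler = lambda name: events.append(("end", name))
    parser.CharacterDataHandler = lambda data: events.append(("chars", data))
    # The CDATA delimiters are reported on their own, apart from the content.
    parser.StartCdataSectionHandler = lambda: events.append(("cdata-open",))
    parser.EndCdataSectionHandler = lambda: events.append(("cdata-close",))
    parser.Parse(document, True)
    return events


# Henry's instance, without the DOCTYPE (expat does not validate anyway).
for event in parse_events("<a><![CDATA[  ]]><b/><c/></a>"):
    print(event)
```

Apart from the two delimiter events, the output should match what the
same parser reports for <a>  <b/><c/></a>: the two spaces arrive as
ordinary character data either way.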

Now, I'd like to reply to what John Cowan wrote:

 > In fact, CDATA elements can be returned to the application as
 > specialized blocks (the DOM allows it, though SAX does not do so);
 > an XML parser *never* needs to look inside a CDATA section after
 > its terminator has been found.

This is an interesting interpretation, but it requires some
clarification:

1. There is no such thing as a "CDATA element" -- CDATA sections are
   lexical.  In fact, the XML 1.0 REC does make it clear that the
   function of a CDATA section is purely escaping, and that its
   contents are equivalent to character data (see clause 2.7).  There
   is no explicit statement allowing parsers to treat CDATA sections
   specially (as there is for comments in clause 2.5), though there is
   no statement forbidding it either.

2. The XML 1.0 REC says nothing about what information the parser
   should deliver to the application when it encounters a CDATA
   section (the XML 1.0 REC is weak in this area generally, but we're
   working to fix the problem in the XML Information Set WG).
   Presumably, because of #1, the rule in clause 2.10 that "an XML
   processor must always pass all characters in a document that are
   not markup through to the application" applies here.

3. John's statement that the XML parser *never* needs to look inside a
   CDATA section after finding the terminator is not quite right --
   all of the contents must match the production Char [2], so the
   parser has to check to make certain that there are no illegal
   characters within the section (such as form feeds).  The version of
   AElfred that I wrote doesn't bother to do this checking, but it's
   non-conforming (I haven't checked Matt's latest versions).
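
The check described in point 3 can be sketched in a few lines of
Python.  This is an illustration only, not AElfred's code: the name
first_illegal_char is hypothetical, and the character class is simply
the negation of production [2] (Char) from the XML 1.0 REC.

```python
import re

# Anything NOT matched by XML 1.0 production [2]:
#   Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
#          | [#x10000-#x10FFFF]
_NOT_CHAR = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def first_illegal_char(text):
    """Return (offset, character) for the first non-Char in text, or None."""
    match = _NOT_CHAR.search(text)
    if match is None:
        return None
    return (match.start(), match.group())


# A form feed (#xC) is not a Char, so a CDATA section containing one
# must be rejected even though the terminator was found.
print(first_illegal_char("  "))
print(first_illegal_char("ab\x0cde"))
```

A parser would run a check like this over the contents of the CDATA
section (and over all other character data) before passing the
characters to the application.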

All the best,

David

-- 
David Megginson                 david@m...
           http://www.megginson.com/
