
Ambiguities in section 4.3.2 of XML 1.0 SE

  • To: xml-dev@l...
  • Subject: Ambiguities in section 4.3.2 of XML 1.0 SE
  • From: "Perry A. Caro" <caro@a...>
  • Date: Mon, 11 Aug 2003 17:02:34 -0700

I sent the following to xml-editor@w.... Am I completely crazy, or are
some clarifications called for in the spec? Would you consider either of the
following examples not well-formed?

Example: <foo bar="]]>"/>

Example: <!ENTITY cdend "]]>">
         <foo bar="&cdend;"/>

With respect to section 4.3.2 of the XML 1.0 Specification Second Edition
and by implication XML 1.1 CR, there appear to be several ambiguities
engendered by the following statement:

  An internal general parsed entity is well-formed if its replacement
  text matches the production labeled content.

... when considered in the context of CDATA Sections and "]]>". For example,
this would imply that the following declaration in an internal DTD subset
would result in an internal general parsed entity that is not well-formed:

<!ENTITY cdend "]]>">

... because the replacement text does not match the [43] content production.
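For reference, "]]>" fails to match content because production [14] CharData
excludes that sequence from character data. The following is a minimal sketch
of that rule in action, assuming Python's standard xml.etree.ElementTree
(backed by expat) as a representative non-validating parser. The helper name
is_well_formed is hypothetical, and one parser's behavior is only a data
point, not a ruling on the spec.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: str) -> bool:
    """Return True if the expat-backed parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# "]]>" written literally as character data, which production [14]
# CharData excludes.
literal_in_content = "<foo>]]></foo>"

# The same characters with ">" escaped are ordinary character data.
escaped_in_content = "<foo>]]&gt;</foo>"

print(is_well_formed(literal_in_content))
print(is_well_formed(escaped_in_content))
```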

If so ...

1) This contradicts statements about Literals in section 2.3, namely:

  Literal data is any quoted string not containing the quotation mark
  used as a delimiter for that string. Literals are used for
  specifying the content of internal entities (EntityValue),

... and production [9] EntityValue. Production [9] permits "]]>" as a
replacement text.

Furthermore, [10] AttValue also permits "]]>". It would be nonsensical for
<foo bar="]]>"/> to be well-formed, but not <foo bar="&cdend;"/>, using the
entity declaration above.
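As a rough check of how one existing parser treats the two attribute forms,
here is a small sketch, again assuming Python's standard
xml.etree.ElementTree (expat underneath). The document strings and the helper
attribute_value are made up for illustration; this shows what one
implementation does, not what the spec requires.

```python
import xml.etree.ElementTree as ET

# Literal "]]>" in an attribute value, as permitted by production [10] AttValue.
literal_doc = '<foo bar="]]>"/>'

# The same value supplied through an internal general entity declared in
# the internal DTD subset.
entity_doc = (
    '<!DOCTYPE foo [\n'
    '  <!ENTITY cdend "]]>">\n'
    ']>\n'
    '<foo bar="&cdend;"/>'
)


def attribute_value(document: str) -> str:
    """Parse the document and return the value of the root's bar attribute."""
    return ET.fromstring(document).get("bar")


literal_value = attribute_value(literal_doc)
entity_value = attribute_value(entity_doc)

# If the parser treats both forms alike, the two values compare equal.
print(literal_value == entity_value)
```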

2) This contradicts the last paragraph of section 4.3.2:

  A consequence of well-formedness in entities is that the logical
  and physical structures in an XML document are properly nested; no
  start-tag, end-tag, empty-element tag, element, comment, processing
  instruction, character reference, or entity reference can begin in
  one entity and end in another.

The list appears to be intended to be exhaustive. The lack of "CDATA
Section" in the list might be interpreted to mean that you can start a CDATA 
Section in one entity, and end it in another. Therefore, the declaration of
&cdend; above should be well-formed.


Since the well-formedness of internal general parsed entities is completely
defined by productions [71] GEDecl, [73] EntityDef, and [9] EntityValue,
what is the value of the statement in section 4.3.2? What does it intend to

Perry A. Caro
Adobe Systems Incorporated

