Re: Attribute normalisation and character entities

  • From: Arjun Ray <aray@q...>
  • To: xml-dev@i...
  • Date: Mon, 24 Jan 2000 11:55:30 -0500 (EST)

On 24 Jan 2000, Richard Tobin wrote:

> Section 3.3.3 seems to me to say that character references are not
> subject to the translation to #x20 [...] 
> The errata (http://www.w3.org/XML/xml-19980210-errata) re-writes this
> section but does not appear to change it in this respect.
> However the Oasis test suite, in tests sa02 and not-sa02, requires
> that they are replaced with spaces.
> Which is correct?

If the intent is to do it the SGML way, then 3.3.3 is correct.  In fact, I
think 3.3.3 (as clarified in the errata) is the best explanation I've seen
of this!:-)

The SGML gotcha here has to do with the 'SEPCHAR' category.  A numeric
character reference is always character data at the point it occurs, and
so doesn't get *parsed* as SEPCHAR (and thus thereafter normalized for
non-CDATA declared values.)
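
For the XML side of the question, the rule in 3.3.3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any parser's actual code: the names normalize, _expand and _spaces are made up for this sketch, the entities argument is assumed to map entity names to replacement text whose character references were already expanded when the declaration was parsed, and predefined entities, line-end handling and well-formedness checks (such as recursive entities) are left out.

```python
import re

# Character references (hex, decimal) and general entity references.
TOKEN = re.compile(r"&#x([0-9A-Fa-f]+);|&#([0-9]+);|&([A-Za-z_:][\w.:-]*);")


def _spaces(literal):
    # Literal whitespace (#x20, #xD, #xA, #x9) is appended as #x20.
    return re.sub(r"[\x20\x0d\x0a\x09]", " ", literal)


def _expand(text, entities):
    out = []
    pos = 0
    for m in TOKEN.finditer(text):
        out.append(_spaces(text[pos:m.start()]))
        hex_ref, dec_ref, name = m.groups()
        if hex_ref is not None:
            # A character reference contributes the character itself,
            # with no whitespace translation.
            out.append(chr(int(hex_ref, 16)))
        elif dec_ref is not None:
            out.append(chr(int(dec_ref)))
        else:
            # An entity reference is processed recursively, so literal
            # whitespace in its replacement text does become #x20.
            out.append(_expand(entities[name], entities))
        pos = m.end()
    out.append(_spaces(text[pos:]))
    return "".join(out)


def normalize(raw, entities=None, cdata=True):
    """Hypothetical helper: normalize one raw attribute value."""
    value = _expand(raw, entities or {})
    if not cdata:
        # Non-CDATA types: trim and collapse #x20 only, so a line feed
        # that arrived via a character reference survives.
        value = re.sub(" +", " ", value).strip(" ")
    return value


print(repr(normalize("blah1&#10;blah2")))
print(repr(normalize("blah1\nblah2")))
print(repr(normalize("grape&lf;banana", {"lf": "\n"}, cdata=False)))
```

The three calls correspond to a character reference, a literal line feed, and a reference to an entity declared as "&#10;".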

Try this file with nsgmls:

<!DOCTYPE foo [
  <!ELEMENT foo - - (#PCDATA) >
  <!ATTLIST foo
            bar   CDATA #IMPLIED
            baz   NAMES #IMPLIED
  >
]>
<foo bar="blah1&#10;blah2" baz="grape&#10;banana">...</foo>

This won't validate (the '&#10;' in baz is data, not a separator, so the
value isn't a list of names).  So

a) Replace '&#10;'  with '&#RE;'.  Now, it will validate. (because RE is a
   SEPCHAR when parsed.)
b) Replace with '&lf;'  and add a declaration in the DTD

    <!ENTITY lf  "&#10;" >

  This, too, will validate (because the character reference substitution
  occurs when the entity declaration is *parsed*, and so is a regular
  literal whitespace character by the time the entity reference is used.)

c) Change the entity declaration to 

    <!ENTITY lf  CDATA "&#10;" >

and now, it won't validate any more. (because the recursive parsing rule
has been short-circuited.)

d) Repeat (b) and (c) with 'RE' for '10' in the entity declaration.  Same
difference in results. 

Ain't this fun?;)
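
A rough XML analogue of the original file and of (b) can be tried with Python's standard xml.etree.ElementTree, which is backed by expat. This is a sketch of the XML behaviour only, not of nsgmls: '&#RE;' and CDATA entities are SGML constructs with no direct XML counterpart, and the parser is non-validating, so it reports the normalized value rather than a validity error. The helper name attr is made up for this example.

```python
import xml.etree.ElementTree as ET


def attr(doc, name="bar"):
    # Parse the document and return one attribute of the root element.
    return ET.fromstring(doc).get(name)


# Literal line feed inside the attribute value.
literal = attr('<foo bar="blah1\nblah2"/>')

# Numeric character reference, as in the original file.
char_ref = attr('<foo bar="blah1&#10;blah2"/>')

# Case (b): the reference is expanded when the entity declaration is
# parsed, so the attribute sees a literal line feed.
via_entity = attr(
    '<!DOCTYPE foo [<!ENTITY lf "&#10;">]>'
    '<foo bar="blah1&lf;blah2"/>'
)

print(repr(literal), repr(char_ref), repr(via_entity))
```

Only the character-reference form should come back with a real line feed in the value; the other two should come back with a space.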



