
Re: Is whitespace within general entities ignorable?

  • From: Rob Lugt <roblugt@e...>
  • To: John Cowan <jcowan@r...>
  • Date: Tue, 27 Feb 2001 17:39:09 +0000

John, thanks for your reply.

I was, however, surprised by the answer because I don't think it has been
widely implemented that way.

So, to clarify, given the following XML file:-

<?xml version = "1.0"?>
<!DOCTYPE doc [
<!ELEMENT doc (test)*>
<!ELEMENT test (child)*>
<!ELEMENT child EMPTY>
<!ENTITY ent1 "   ">
<!ENTITY ent2 "    <child/>   ">
]>

<doc>
 <!-- this is valid -->
 <test> &ent2; </test>
 <!-- but this isn't -->
 <test> &ent1; </test>
</doc>

This means that an XML processor that parses entities when they are
referenced cannot decide whether or not the reference is valid until it
comes across an element start tag in the entity replacement text.  If it
does, then all the preceding white space is insignificant - otherwise the
document is invalid.
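
To make the deferred decision concrete, here is a minimal sketch in Python of the check as I understand the reading described above. It is not taken from any real XML processor: the function name is hypothetical, it looks only at the entity's replacement text, and it treats any "<" as the start of an element, ignoring comments, processing instructions and nested entity references.

```python
# White space characters matched by the XML 1.0 nonterminal S.
XML_WHITESPACE = " \t\r\n"


def entity_ok_in_element_content(replacement_text):
    """Hypothetical check for the reading discussed in this thread.

    Scans the replacement text of a general entity that is referenced
    inside element content. No decision is possible while only white
    space has been seen; the first other character settles it.
    """
    for ch in replacement_text:
        if ch in XML_WHITESPACE:
            continue  # still undecided: keep scanning
        # Simplification: "<" is assumed to begin a child element.
        # Any other character is character data, which element
        # content does not allow.
        return ch == "<"
    # Only white space (or nothing) was found. The thread does not
    # cover the empty case; this sketch arbitrarily treats it the same.
    return False


print(entity_ok_in_element_content("    <child/>   "))  # ent2
print(entity_ok_in_element_content("   "))              # ent1
```

The point of the sketch is only that the white space at the start of ent2 cannot be classified until the "<" is reached, which is the behaviour I have not seen in practice.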

I can see how you come to this conclusion by carefully reading the wording
of "Validity constraint: Element Valid" in Section 3 of the XML
Recommendation [1], but I have not found this to be implemented in any of
the XML processors that I have tried.  Is this just another case of
imperfect implementations?

> > Section 2.10 mentions the distinction between significant and
> > insignificant white space but doesn't give a definition. The validity
> > constraint in Section 3(2) for element content talks about the white
> > space surrounding child elements having to match the non terminal
> > S[3] - but is this after entity substitution has been performed?
>
> No.  The whitespace in element content has to be real whitespace
> characters: not entity references, not CDATA sections.
>
> > <?xml version = "1.0"?>
> > <!DOCTYPE test [
> > <!ELEMENT test (child)*>
> > <!ENTITY entws "   ">
> > ]>
> > <test>
> >  &entws;
> > </test>
>
> Invalid.
>
>
> > If we say that it is illegal to have white space within the GE, then
> > something like this would also be illegal:-
> > <!ENTITY entws "   <child/>  ">
>
> Valid.
>
> > Secondly, what if the content model of <test> was changed to EMPTY?
> > Would this make any difference to your view?  It appears to me that
> > including the reference to &entws; creates content - which is illegal
> > for EMPTY elements.
>
> Right.  In an EMPTY element, the start-tag and the end-tag must abut,
> with nothing at all between them.
>
> > And finally, what if &entws; was declared as: <!ENTITY entws "&#9;">.
> > Would that make any difference?
>
> No.
>

Kind regards
Rob Lugt
ElCel Technology
http://www.elcel.com

[1] http://www.w3.org/TR/REC-xml

