
Re: SAX2 RFD: LexicalHandler draft v.1.1

  • From: Lars Marius Garshol <larsga@i...>
  • To: "XML Developers' List" <xml-dev@i...>
  • Date: 25 Mar 1999 11:01:43 +0100

* David Megginson
|     public abstract void startCDATA ()
| 	throws SAXException;
|     public abstract void endCDATA ()
| 	throws SAXException;

This implies that the parser reports the contents of CDATA sections as
separate DocumentHandler.characters events, which is of course the
most natural way to implement things anyway.
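
To make that event order concrete, here is a minimal sketch. It
assumes the SAX2 API as later shipped with the JDK (DefaultHandler and
org.xml.sax.ext.LexicalHandler) rather than the DocumentHandler of
this draft. The class name CdataTrace and the <p> wrapper element in
the usage note are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.LexicalHandler;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: records text and CDATA boundaries in report order. */
final class CdataTrace extends DefaultHandler implements LexicalHandler {
    final List<String> events = new ArrayList<>();

    @Override public void characters(char[] ch, int start, int length) {
        events.add("characters(\"" + new String(ch, start, length) + "\")");
    }

    // The two lexical callbacks under discussion.
    @Override public void startCDATA() { events.add("startCDATA"); }
    @Override public void endCDATA() { events.add("endCDATA"); }

    // Remaining LexicalHandler callbacks are not needed for this sketch.
    @Override public void startDTD(String name, String publicId, String systemId) {}
    @Override public void endDTD() {}
    @Override public void startEntity(String name) {}
    @Override public void endEntity(String name) {}
    @Override public void comment(char[] ch, int start, int length) {}

    /** Parses the given document text and returns the recorded events. */
    static List<String> trace(String xml) {
        try {
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            CdataTrace handler = new CdataTrace();
            reader.setContentHandler(handler);
            // Standard SAX2 property id for registering a LexicalHandler.
            reader.setProperty(
                "http://xml.org/sax/properties/lexical-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling CdataTrace.trace("<p>A <![CDATA[problematic]]> case.</p>")
returns the events in the order the parser reported them. How many
characters calls appear around the CDATA markers is up to the parser.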

However, the 1999-03-12 list of core features contains this:

    Ensure that all consecutive text is returned in a single callback to
    DocumentHandler.characters or DocumentHandler.ignorableWhitespace
    (true) or explicitly do not require it (false).

This is potentially problematic, since it's unspecified what the
parser should do about CDATA sections in this case. (I suspect we will
see more problems of this kind when we really start using and
stacking filters.) Should they be normalized, or should they be
reported separately? (I.e.: what is consecutive text, exactly?) The same
problem appears with entity boundaries and character references.

I assume most users of normalize-text will want consecutive text to be
interpreted in the logical view of the document, rather than the
lexical view. Otherwise the DocumentHandler will receive different
events in these two cases:

  A problematic case.


  A <![CDATA[problematic]]> case.

which is rather fragile, and this behaviour should be avoided, IMHO.
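
A minimal sketch of the logical interpretation, assuming the SAX2 API
as later shipped with the JDK: the handler below buffers characters
calls and only emits a chunk at element boundaries, so CDATA markers
never split the text. LogicalText is a hypothetical name, the <p>
wrapper in the usage note is my own, and other markup (comments,
processing instructions) is ignored for brevity.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: merges adjacent characters() calls into one chunk. */
final class LogicalText extends DefaultHandler {
    final List<String> chunks = new ArrayList<>();
    private final StringBuilder pending = new StringBuilder();

    @Override public void characters(char[] ch, int start, int length) {
        // Accumulate only; nothing is emitted until an element boundary.
        pending.append(ch, start, length);
    }

    @Override public void startElement(String uri, String localName,
                                       String qName, Attributes atts) {
        flush();
    }

    @Override public void endElement(String uri, String localName,
                                     String qName) {
        flush();
    }

    /** Emits the buffered text as one chunk, if there is any. */
    private void flush() {
        if (pending.length() > 0) {
            chunks.add(pending.toString());
            pending.setLength(0);
        }
    }

    /** Parses the given document text and returns the merged text chunks. */
    static List<String> chunksOf(String xml) {
        try {
            LogicalText handler = new LogicalText();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.chunks;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this handler, LogicalText.chunksOf gives the same list for
"<p>A problematic case.</p>" and for
"<p>A <![CDATA[problematic]]> case.</p>", which is the behaviour I
would expect most normalize-text users to want.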

So basically the problem is that normalize-text and LexicalHandler
don't go well together. You can have one, but not both at the same
time, unless the driver changes its behaviour. In other words, this
seems to require the driver to have explicit knowledge about

Possible solutions:

 - reject normalize-text true if a LexicalHandler has been registered,
 and reject LexicalHandler registration if normalize-text has been set
 to true
 - make normalize-text have a logical interpretation by default, and
 switch to lexical if a LexicalHandler has been registered
 - make normalize-text always have a lexical interpretation
 - have separate normalize-text-logical and normalize-text-lexical
 events, with reject-behaviour for the first
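
The first option could be sketched roughly as below. Everything here
is hypothetical: SketchDriver and its two setters are not part of any
SAX release, and using SAXNotSupportedException for the rejection is
my assumption about how a driver might signal it. LexicalHandler is
taken from org.xml.sax.ext as later shipped with the JDK.

```java
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.ext.LexicalHandler;

/** Hypothetical driver state illustrating mutual rejection. */
final class SketchDriver {
    private boolean normalizeText;
    private LexicalHandler lexicalHandler;

    /** Rejects normalize-text=true while a LexicalHandler is registered. */
    void setNormalizeText(boolean value) throws SAXNotSupportedException {
        if (value && lexicalHandler != null) {
            throw new SAXNotSupportedException(
                "normalize-text conflicts with a registered LexicalHandler");
        }
        normalizeText = value;
    }

    /** Rejects LexicalHandler registration while normalize-text is true. */
    void setLexicalHandler(LexicalHandler handler)
            throws SAXNotSupportedException {
        if (handler != null && normalizeText) {
            throw new SAXNotSupportedException(
                "LexicalHandler conflicts with normalize-text=true");
        }
        lexicalHandler = handler;
    }
}
```

Whichever of the two is set first wins, and the application finds out
about the conflict at configuration time instead of through changed
events during the parse.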


--Lars M.
