
Re: SAX: Whitespace Handling (question 5 of 10)

  • From: Peter Murray-Rust <peter@u...>
  • To: Michael Kay <M.H.Kay@e...>, xml-dev@i...
  • Date: Wed, 07 Jan 1998 01:01:21

At 14:16 05/01/98 -0000, Michael Kay wrote:
>>BTW: IMHO, IFF there is going to be a "default implementation" anyway, I
>>would actually prefer an "ignorableWhitespace" method which calls charData
>>by default. This will permit cleaner implementations.
>
>
>I may be simple-minded, but surely the default action with ignorable white
>space should be to ignore it?

Not simple-minded :-)

The whitespace issue is not trivial, but is (I think) consistent. The
*parser* has no option except to pass all characters that are not markup to
the application. This means that in:
<FOO>
  <BAR/>
</FOO>

A parser MUST pass the equivalent of

<FOO>\n\s\s<BAR></BAR>\n</FOO>

to the application.  
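
As a rough illustration (a sketch using Python's standard xml.sax module and
its non-validating expat parser, neither of which is part of this thread; the
Echo class name is made up), a handler that simply echoes every event shows
the whitespace arriving as character data:

```python
import xml.sax
from xml.sax.handler import ContentHandler


class Echo(ContentHandler):
    """Hypothetical handler that rebuilds the event stream as text."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def startElement(self, name, attrs):
        self.parts.append("<%s>" % name)

    def endElement(self, name):
        self.parts.append("</%s>" % name)

    def characters(self, content):
        # The parser may deliver text in several chunks, so just collect them.
        self.parts.append(content)


handler = Echo()
xml.sax.parseString(b"<FOO>\n  <BAR/>\n</FOO>", handler)
rebuilt = "".join(handler.parts)
print(repr(rebuilt))
```

Joining the collected parts should reproduce the string above, with the
empty-element tag expanded into a start/end pair and both runs of whitespace
intact.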

In a well-formed document there is NO indication of which character data
are significant and which are "ignorable", so by default the application will
have a tree structure where FOO has 3 children.

FOO
  "\n\s\s"
  BAR
  "\n"
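
The same tree can be checked with a small sketch, assuming Python's standard
xml.dom.minidom (a DOM builder chosen only for illustration; describe() is a
made-up helper):

```python
from xml.dom import minidom

doc = minidom.parseString("<FOO>\n  <BAR/>\n</FOO>")
foo = doc.documentElement


def describe(node):
    """Hypothetical helper: show text nodes as quoted strings, elements by name."""
    if node.nodeType == node.TEXT_NODE:
        return repr(node.data)
    return node.tagName


children = [describe(child) for child in foo.childNodes]
print(len(children), children)
```

With no DTD in sight, the builder has no grounds to drop either text node, so
FOO ends up with a text child, BAR, and another text child.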

If the application is told through
stylesheets/PIs/hardcoded_semantics/telepathy/a_human that all whitespace
is ignorable, fine - but it is NOT part of the XML spec.

If the DTD reads:

<!ELEMENT FOO (BAR)>

the "validating parser" (and we are still struggling with exactly what one
of those is :-) MUST tell the application:

"Hey! Be careful! I've sent you a FOO, but it has element-only content, so
you may wish to ignore all the whitespace-only children of the FOO". The
application should say thank you, and then do whatever it feels like doing
with this information.
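
One thing an application might choose to do with that information is sketched
below. This assumes the application has somehow been handed the set of
element-only names (ELEMENT_ONLY and drop_ignorable are hypothetical, and
minidom is used only as a convenient tree); it is one possible policy, not
something the spec requires.

```python
from xml.dom import minidom

# Hypothetical: element names the parser reported as having element-only content.
ELEMENT_ONLY = {"FOO"}

XML_WHITESPACE = " \t\r\n"


def drop_ignorable(element, element_only=ELEMENT_ONLY):
    """Remove whitespace-only text children of element-only elements, recursively."""
    for child in list(element.childNodes):
        if child.nodeType == child.ELEMENT_NODE:
            drop_ignorable(child, element_only)
        elif (child.nodeType == child.TEXT_NODE
              and element.tagName in element_only
              and child.data.strip(XML_WHITESPACE) == ""):
            element.removeChild(child)


doc = minidom.parseString("<FOO>\n  <BAR/>\n</FOO>")
drop_ignorable(doc.documentElement)
remaining = [child.nodeName for child in doc.documentElement.childNodes]
print(remaining)
```

An application that wants to keep the whitespace (a pretty-printer, say) would
simply not call drop_ignorable.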

HOW the parser tells the application is what we are tackling.  DavidM has
suggested that when the "ignorable whitespace" is emitted from the parser,
it generates a special event. This seems reasonable - I suppose there could
be other methods (even simply announcing which elements had element-only
content should be sufficient).
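
A sketch of what the special-event approach could look like from the
application side follows. It borrows the ignorableWhitespace method name from
Python's xml.sax ContentHandler, which is an assumption about how the event
might be spelled, not the interface under discussion here. As far as I know
the expat parser bundled with Python does not validate and does not raise
this event, so the events are replayed by hand by a made-up replay() function
standing in for a validating parser. The keep_ignorable flag (also
hypothetical) switches between the two defaults debated above: ignore the
whitespace, or forward it to the ordinary character-data callback.

```python
from xml.sax.handler import ContentHandler


class RecordingHandler(ContentHandler):
    """Hypothetical handler that records the events it receives."""

    def __init__(self, keep_ignorable=False):
        super().__init__()
        self.keep_ignorable = keep_ignorable
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        self.events.append(("chars", content))

    def ignorableWhitespace(self, whitespace):
        # Either drop the whitespace or treat it as ordinary character data.
        if self.keep_ignorable:
            self.characters(whitespace)


def replay(handler):
    """Hand-driven stand-in for a validating parser that knows FOO is (BAR)."""
    handler.startElement("FOO", {})
    handler.ignorableWhitespace("\n  ")
    handler.startElement("BAR", {})
    handler.endElement("BAR")
    handler.ignorableWhitespace("\n")
    handler.endElement("FOO")
    return handler.events


ignored = replay(RecordingHandler())
kept = replay(RecordingHandler(keep_ignorable=True))
print(ignored)
print(kept)
```

Either way the parser has done its job by flagging the whitespace; what
happens next is the application's decision.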

[Please shoot this down if I've got it wrong :-)].

	P.
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)

