Re: 5 Whitespace Rules

  • From: "Neil Bradley" <neil@b...>
  • To: xml-dev@i...
  • Date: Sat, 9 Aug 1997 02:59:42 +0000


> Reply-to:      Paul Grosso <paul@a...>

> At 23:13 1997 08 08 +0000, Neil Bradley wrote:
> >RULE 3. All other whitespace in element content is discarded.
> 
> >
> >Note that only the presence of spaces and tabs in element content,
> >which is not common, will cause discrepancies between validated and
> > non-validated processing.
> 
> This is the crux of the problem.  As soon as you say something about
> element content, you get different results from the document when
> you process the DTD and when you don't.  

Yes, but as I say, the problem only arises if people put spaces or
tabs in element content, which in my experience is very unusual.

> You don't say explicitly what happens when you don't process the
> DTD, but I assume your Rule 3 doesn't do anything in that case. 
> Therefore, your Rule 5 will turn all line-end codes into a space,
> and it is extremely common to have line-end codes in element
> content.  So your Rule 3 will cause you to end up with lots of
> spaces when you process in the absence of a DTD that you wouldn't
> get when you process in the presence of the DTD.

No, Rule 2 has already dispensed with these CR and LF codes. I 
should have made it clear that this rule applies to non-validated
input.  So...

 <chapter>[CR]
 <note>[CR]
 <p>[CR]
 This is a para in a note[CR]
 </p>[CR]
 </note>[CR]
 ...

becomes

 <chapter><note><p>This is a para in a note</p></note>...

...before Rules 3 and 5 are applied.

This was my whole point about separating line-end code processing from
spacing character processing.
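
The transformation in the example above can be sketched in a few lines of Python. Rule 2 is not restated in this message, so the sketch assumes, from the example alone, that it discards any line-end code sitting directly after or directly before a tag. The function name and the regular expressions are illustrative only, and the treatment of every ">" or "<" as a tag boundary is a simplification.

```python
import re

# Assumed reading of Rule 2 (inferred from the example, not from a stated
# definition): a line-end code directly next to a tag is discarded.
LINE_END = r"(?:\r\n|\r|\n)"


def strip_line_ends_next_to_tags(text: str) -> str:
    """Remove line-end codes that touch a tag; leave all others alone."""
    # Drop a line end that immediately follows a tag's closing '>'.
    text = re.sub(r"(?<=>)" + LINE_END, "", text)
    # Drop a line end that immediately precedes a tag's opening '<'.
    text = re.sub(LINE_END + r"(?=<)", "", text)
    return text


sample = "<chapter>\r<note>\r<p>\rThis is a para in a note\r</p>\r</note>\r"
print(strip_line_ends_next_to_tags(sample))
```

Spaces and tabs are deliberately left untouched here, which is the separation of line-end processing from spacing-character processing argued for above.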

> >
> >RULE 4.  Line-end codes are discarded when preceded by a hard or
> >soft ('&#173;') hyphen (and a soft hyphen is also discarded).
> >Remaining line-end codes are treated as spaces.
> 
> This might be a nice heuristic for incoming WP files, but it doesn't
> agree with SGML.  If I had "a - b" in my document and a line-end
> happened to occur after the -, you'd turn my file into "a -b".

Yes, well, I can only suggest this is unlikely to happen, and in any
case Rule 4 is only a suggestion for paginating applications. I am
open to suggestions here, but for now I am far more concerned about
Rules 1 to 3.
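
For anyone who wants to experiment with Rule 4 as quoted above, here is a minimal Python sketch. It assumes the soft hyphen is the character U+00AD and that a line-end code is CR, LF, or CR LF; the function name is made up for illustration. It also reproduces Paul's "a - b" case, so the objection is easy to see.

```python
import re

SOFT_HYPHEN = "\u00ad"            # assumption: soft hyphen is U+00AD
LINE_END = r"(?:\r\n|\r|\n)"      # assumption: CR, LF, or CR LF


def apply_rule_4(text: str) -> str:
    """Sketch of Rule 4: hyphen-aware handling of line-end codes."""
    # Soft hyphen followed by a line end: discard both.
    text = re.sub(SOFT_HYPHEN + LINE_END, "", text)
    # Hard hyphen followed by a line end: keep the hyphen, drop the line end.
    text = re.sub(r"-" + LINE_END, "-", text)
    # Every remaining line end is treated as a space.
    return re.sub(LINE_END, " ", text)


print(apply_rule_4("well-\nknown"))   # hard hyphen at a line end
print(apply_rule_4("a -\nb"))         # Paul's counter-example
```

The second call joins "a -" and "b" with no space between them, which is exactly the side effect Paul describes.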

> paul

Neil.

-----------------------------------------------
Neil Bradley - Author of The Concise SGML Companion.
neil@b...
www.bradley.co.uk

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@i... the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@i...)

