
Whitespace

  • From: bdonoghoe@s... (Bill Donoghoe)
  • To: xml-dev@i...
  • Date: Tue, 19 Aug 1997 23:44:08 +1000 (EST)


>Sean Mc Grath wrote:
>>> Peter Murray-Rust's post removed to conserve space
>
>**Warning:** Rush of blood to the head follows. Get those flame throwers
>ready...
>
>I know this whole white space thing was thrashed out at length some time ago but
>it worries me greatly that on XML-DEV the whole issue seems to be as problematic
>as it was before XML-Lang's rulings on whitespace handling were decided upon.
>It seems that the problem was not really solved - just pushed up a layer:-)
>
>It just sounds wrong to me that white space handling is to be the subject of
>application conventions rather than part of the core XML parsing activity.
>
>Anyway, I think everyone should be allowed to over-simplify the "White Space
>Problem" once in their lives! Here is my contribution:-
>
>
>Ban mixed content. Mixed content is a markup minimization feature.
>
>If you want a chunk of PCDATA in an XML doc, use the <PCDATA>
>reserved element name.
>
><foo>
>   <pcdata>I am data 1</pcdata>
>   <pcdata>I am data 2</pcdata>
></foo>
>
>Becomes
><foo><pcdata>I am data 1</pcdata><pcdata>I am data 2</pcdata></foo>
>
>If you need whitespace to be something other than whitespace- i.e. a
>newline to be a real newline to be passed on to the application, use an
>empty element type to represent it.
>
><foo>
>   <pcdata>I am data 1</pcdata><newline/>
>   <pcdata>I am data 2</pcdata>
></foo>
>
>
>Give me five minutes to put on the asbestos suit and then you flame
>away....
>
Instead of flaming you I will hop onto the bandwagon (can I borrow the 
asbestos suit for a while?).

First, to paraphrase some earlier comments: the "whitespace problem" 
results from whitespace's dual personality.

Personality 1.  The programmer's whitespace ("pretty printing") is used as a 
layout tool for visual editing of the markup and content.  Besides, lots of 
editing applications won't allow lines over 250 characters.

Personality 2.  The whitespace is part of the content, put there because the 
author either wanted it that way or could not see any other easy way to 
encode the information correctly.

SGML tried to cater for both personalities and it succeeded in a moderate 
fashion.  The downside was that it is not an easy task to maintain and 
process SGML documents.

Now for some personal opinion on what I thought XML was all about.  XML is 
an attempt to either simplify SGML (get rid of or change the bits which make 
it hard to understand/use/process) or extend HTML to deal with information 
content as well as presentation.  I lean towards the former view "SGML for 
the Web".  

IMHO the current XML "whitespace handling" has not simplified the SGML 
situation significantly.

Here are some comments and slight variations on Sean's suggestion.

I believe that Sean's suggestion has plenty of merit.

What is wrong with having some standard elements 
(<PCDATA>, <CDATA>, <NEWLINE>) which are part of every XML DTD?

If you didn't want users to have to author these tags then "normalisation" 
applications could be developed which could convert "raw" XML into the 
"normalised" version.

Example:

<foo>
   I am data 1
   I am <emph>data</emph> 2
</foo>

could be normalised to:

<foo>
   <pcdata>I am data 1</pcdata><newline/>
   <pcdata>I am data 2</pcdata>
</foo>

or

<foo><pcdata>I am data 1 I am <emph>data</emph> 2</pcdata>
</foo>

depending on the DTD declarations for the elements or a style sheet (?!!)
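
A minimal sketch of such a "normalisation" application, in Python with the
standard xml.etree.ElementTree module. The function names and the policy are
assumptions made for illustration, not part of any XML specification: every
text run is wrapped in <pcdata>, a line break between two pieces of text in
the same run becomes <newline/>, and indentation next to a line break is
treated as pretty printing and dropped. Unlike the first normalised form
above, this sketch keeps <emph> and wraps its text as well, so no mixed
content remains.

```python
import xml.etree.ElementTree as ET


def _wrap_text(text):
    """Turn one text run into <pcdata> elements, with <newline/> between lines."""
    pieces = (text or "").split("\n")
    last = len(pieces) - 1
    out = []
    for i, piece in enumerate(pieces):
        if i > 0:
            piece = piece.lstrip()  # indentation after a line break
        if i < last:
            piece = piece.rstrip()  # trailing blanks before a line break
        if not piece:
            continue  # nothing but layout whitespace on this line
        if out:
            # Assumed policy: a line break between two pieces of text
            # in the same run is significant.
            out.append(ET.Element("newline"))
        pcdata = ET.Element("pcdata")
        pcdata.text = piece
        out.append(pcdata)
    return out


def normalise_element(elem):
    """Return a copy of elem in which all character data sits inside <pcdata>."""
    new = ET.Element(elem.tag, dict(elem.attrib))
    new.extend(_wrap_text(elem.text))
    for child in elem:
        new.append(normalise_element(child))
        new.extend(_wrap_text(child.tail))
    return new


def normalise(raw_xml):
    """Convert "raw" XML text into the "normalised" form (hypothetical helper)."""
    root = normalise_element(ET.fromstring(raw_xml))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    raw = "<foo>\n   I am data 1\n   I am <emph>data</emph> 2\n</foo>"
    print(normalise(raw))
```

A DTD or style sheet could of course select a different policy (for example
treating the line break as an ordinary space); the sketch hard-codes one
choice only to show that the conversion is mechanical.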

However, normalisation is not needed if the authors can be given tools which 
can produce the desired markup.

Thus, all whitespace in the "normalised" documents could be collapsed to a 
single space (because we removed personality 2 we are only left with pretty 
printing).
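
As a rough illustration of that collapsing step, here is a Python sketch
using xml.etree.ElementTree. It assumes Sean's form, in which <pcdata> holds
only text, and it goes one step further than "a single space" by dropping
whitespace-only runs between elements altogether; both points are
assumptions of this sketch, and the helper names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def _drop_if_blank(text):
    """Discard a run that is only layout whitespace; keep anything else."""
    return None if text is None or not text.strip() else text


def collapse(elem):
    """Collapse pretty-printing whitespace in a normalised tree, in place."""
    if elem.tag == "pcdata":
        # Inside <pcdata>, squeeze each whitespace run to one space.
        elem.text = re.sub(r"\s+", " ", elem.text or "")
    else:
        # Outside <pcdata>, whitespace can only be pretty printing.
        elem.text = _drop_if_blank(elem.text)
    for child in elem:
        collapse(child)
        child.tail = _drop_if_blank(child.tail)


def canonical(xml_text):
    """Parse, collapse and re-serialise (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    collapse(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    pretty = (
        "<foo>\n"
        "   <pcdata>I am data 1</pcdata>\n"
        "   <pcdata>I am data 2</pcdata>\n"
        "</foo>"
    )
    compact = "<foo><pcdata>I am data 1</pcdata><pcdata>I am data 2</pcdata></foo>"
    # The pretty-printed and compact layouts carry the same information.
    print(canonical(pretty) == canonical(compact))
```

With personality 2 moved into explicit elements, the indented and the
single-line layouts of the same document compare equal after collapsing.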

I will stop rambling now.

IMHO the solution lies in removing the dual personalities of whitespace at 
document authoring time (or at its interface to XML tools for documents 
tagged by human hand).

Regards,
Bill


Bill Donoghoe              bdonoghoe@a...
InfoTech (NSW) Pty Ltd     mobile: 014 625 397 (in Australia)
SGML/HyTime/DSSSL/XML Consultancy and Development


xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@i... the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@i...)

