Whitespace
>Sean Mc Grath wrote:
>>> Peter Murray-Rust's post removed to conserve space
>
>**Warning:** Rush of blood to the head follows. Get those flame throwers
>ready...
>
>I know this whole white space thing was trashed out at length some time ago but
>it worries me greatly that on XML-DEV the whole issue seems to be as problematic
>as it was before XML-Lang's rulings on whitespace handling where decided upon.
>It seems that the problem was not really solved - just pushed up a layer:-)
>
>It just sounds wrong to me that white space handling is to be the subject of
>application conventions rather than part of the core XML parsing activity.
>
>Anyway, I think everyone should be allowed over-simplify the "White Space
>Problem"
>once in there lives! Here is my contribution:-
>
>
>Ban mixed content. Mixed content is a markup minimization feature.
>
>If you want a chunk of PCDATA in an XML doc, use the <PCDATA>
>reserved element name.
>
><foo>
> <pcdata>I am data 1</pcdata>
> <pcdata>I am data 2</pcdata>
></foo>
>
>Becomes
><foo><pcdata>I am line 1</pcdata><pcdata>I am line 2</pcdata></foo>
>
>If you need whitespace to be something other than whitespace- i.e. a
>newline to be a real newline to be passed on to the application, use an
>empty element type to represent it.
>
><foo>
> <pcdata>I am data 1</pcdata><newline/>
> <pcdata>I am data 2</pcdata>
></foo>
>
>
>Give me five minutes to put on the asbestos suit and then you flame
>away....
>

Instead of flaming you I will hop onto the bandwagon (can I borrow the
asbestos suit for a while?).

Firstly, to paraphrase some earlier comments, the "whitespace problem" results
from whitespace having a dual personality.

Personality 1. The programmer's whitespace ("pretty printing"), used as a
layout tool for visual editing of the markup and content. Besides, lots of
editing applications won't allow lines over 250 characters.

Personality 2. Whitespace that is part of the content, used because the
author either wanted it that way or could not see any other easy way to
encode the information correctly.

SGML tried to cater for both personalities and succeeded only moderately. The
downside was that SGML documents are not easy to maintain and process.

Now for some personal opinion on what I thought XML was all about. XML is an
attempt either to simplify SGML (get rid of or change the bits which make it
hard to understand/use/process) or to extend HTML to deal with information
content as well as presentation. I lean towards the former view, "SGML for the
Web". IMHO the current XML "whitespace handling" has not simplified the SGML
situation significantly.

Here are some comments and slight variations on Sean's suggestion.

I believe that Sean's suggestion has plenty of merit. What is wrong with
having some standard elements (<PCDATA>, <CDATA>, <NEWLINE>) which are part of
every XML DTD? If you didn't want users to have to author these tags, then
"normalisation" applications could be developed to convert "raw" XML into the
"normalised" version.

Example:

<foo>
I am data 1
I am <emph>data</emph> 2
</foo>

could be normalised to:

<foo>
 <pcdata>I am data 1</pcdata><newline/>
 <pcdata>I am data 2</pcdata>
</foo>

or

<foo><pcdata>I am data 1 I am <emph>data</emph> 2</pcdata>
</foo>

depending on the DTD declarations for the elements or a style sheet (?!!)

However, normalisation is not needed if the authors can be given tools which
produce the desired markup directly.

Thus, all whitespace in the "normalised" documents could be collapsed to a
single space (because we have removed personality 2, we are left only with
pretty printing).

I will stop rambling now. IMHO the solution lies in removing the dual
personalities of whitespace at document authoring time (or at the interface to
XML tools, for documents tagged by human hand).

Regards,
Bill

Bill Donoghoe bdonoghoe@a...
InfoTech (NSW) Pty Ltd
mobile: 014 625 397 (in Australia)
SGML/HyTime/DSSSL/XML Consultancy and Development

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@i... the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@i...)
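
The "normalisation" step described in the message above could be sketched roughly as follows. This is only an illustration, not a tool from the thread: the function name `normalise` is hypothetical, it assumes every line break inside the content is significant (the "personality 2" case), and it flattens inline markup such as <emph> to plain text, as the first normalised form in the message appears to do. It uses Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET


def normalise(raw_xml: str) -> str:
    """Rewrite an element's mixed content as <pcdata>/<newline/> children.

    Assumptions (not from the original message): line breaks in the
    content are significant, and inline elements are flattened to text.
    """
    source = ET.fromstring(raw_xml)

    # Flatten text and inline elements into one string, then split on lines.
    text = "".join(source.itertext())

    # Collapse runs of whitespace inside each line to a single space.
    lines = [" ".join(line.split()) for line in text.splitlines()]

    # Blank lines are treated as pretty printing and dropped.
    lines = [line for line in lines if line]

    result = ET.Element(source.tag, source.attrib)
    for index, line in enumerate(lines):
        if index > 0:
            # An explicit empty element marks each significant line break.
            ET.SubElement(result, "newline")
        ET.SubElement(result, "pcdata").text = line
    return ET.tostring(result, encoding="unicode")


raw = "<foo>\nI am data 1\nI am <emph>data</emph> 2\n</foo>"
print(normalise(raw))
```

Once a document is in this form, any whitespace left between the elements is pretty printing only, so a downstream tool can discard or collapse it without consulting the DTD. A real normaliser would need the DTD or a style sheet to decide, per element, whether line breaks are significant and whether inline markup should be kept.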