Re: Whitespace
-----BEGIN PGP SIGNED MESSAGE-----

Bill> Bill Donoghoe <URL:mailto:bdonoghoe@s...>

>> Sean Mc Grath wrote:
>> If you want a chunk of PCDATA in an XML doc, use the <PCDATA>
>> reserved element name.
>> Give me five minutes to put on the asbestos suit and then you flame
>> away....

Bill> Instead of flaming you I will hop onto the bandwagon (can I
Bill> borrow the asbestos suit for a while).

Is it big enough for three? ;-)

I believe that the PCDATA element proposal has merit, but not that it's the solution to all ills. If it's defined well, and if XML processors do the Right Thing, it would eliminate misunderstandings between author and reader. But it might be too verbose for some authors (who might be happy with application-defined rules, for instance).

But if PCDATA is defined as "<!ELEMENT pcdata (#PCDATA)>", then it might be sensible to define specific whitespace rules for it. I'm leery of recommending PRESERVE semantics, because

<ul>
  <li><p>The first (and only) item in the list.</p>
  </li>
</ul>

occurs in many sources, and the naive transformation (enclosing all #PCDATA with <PCDATA></PCDATA>) includes some "programmer's whitespace" at the beginning of the line after <li>.

Bill> What is wrong with having some standard elements
Bill> (<PCDATA>,<CDATA>,<NEWLINE>) which are part of every XML DTD?

CDATA wouldn't work, for a start.

Bill> If you didn't want users to have to author these tags then
Bill> "normalisation" applications could be developed which could
Bill> convert "raw" XML into the "normalised" version.
Bill> Example:
Bill> <foo>
Bill> I am data 1
Bill> I am <emph>data</emph> 2
Bill> </foo>
Bill> could be normalised to:
Bill> [...]
Bill> or
Bill> <foo><pcdata>I am data 1 I am <emph>data</emph> 2</pcdata>
Bill> </foo>

Er, don't you mean

> <foo><pcdata>I am data 1 I am </pcdata><emph><pcdata>data</pcdata
> ></emph><pcdata> 2</pcdata></foo>

? I was under the impression that the PCDATA element was to contain *only* character data, and no elements (i.e. RCDATA in SGML terms).
As in my element declaration above, and with #PCDATA in content models replaced with PCDATA throughout.

[For non-validating processors: we'd need a way to indicate that this convention is in use]

-----BEGIN PGP SIGNATURE-----
Version: 2.6.3i
Charset: noconv
Comment: Processed by Mailcrypt 3.4, an Emacs/PGP interface

iQB1AwUBM/m6uedsuUurvcRtAQHgUwL/UR/UZ5XUhzEH84s+67Ulu5P09B5G2OxF
ahFvctktCu0KuzClfmkZiQUbHS7adGvlxfFtm5da2tgsqOszEPONQOfjyR9S3D6C
Qr3andAcVy9+wsfB65yd0eqsMUBhctZe
=0PcY
-----END PGP SIGNATURE-----

--
xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@i... the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@i...)
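
For readers who want to experiment with the normalisation discussed in the message, here is a minimal Python sketch of the naive transformation, in which every run of character data is wrapped in its own <pcdata> element that contains no child elements. The function name `wrap_pcdata` and the `drop_blank` flag are hypothetical and do not come from the thread. Dropping whitespace-only runs is only one possible whitespace rule, assumed here for illustration; the thread does not settle on one.

```python
import xml.etree.ElementTree as ET


def wrap_pcdata(elem, drop_blank=False):
    """Return a copy of elem with each run of character data wrapped
    in its own <pcdata> child, so <pcdata> never contains elements.

    drop_blank is a hypothetical switch: when True, runs made only of
    whitespace ("programmer's whitespace") are discarded, not wrapped.
    """
    out = ET.Element(elem.tag, dict(elem.attrib))

    def add_text(text):
        if not text:
            return
        if drop_blank and not text.strip():
            return
        ET.SubElement(out, "pcdata").text = text

    add_text(elem.text)              # character data before the first child
    for child in elem:
        out.append(wrap_pcdata(child, drop_blank))
        add_text(child.tail)         # character data following this child
    return out


foo = ET.fromstring("<foo>I am data 1 I am <emph>data</emph> 2</foo>")
foo_out = ET.tostring(wrap_pcdata(foo), encoding="unicode")
print(foo_out)

ul = ET.fromstring(
    "<ul>\n  <li><p>The first (and only) item in the list.</p>\n  </li>\n</ul>"
)
naive = wrap_pcdata(ul)                    # keeps the indentation runs
strict = wrap_pcdata(ul, drop_blank=True)  # discards whitespace-only runs
ul_out = ET.tostring(strict, encoding="unicode")
print(ul_out)
```

On the <foo> example the sketch produces the three separate <pcdata> runs given in the reply. On the list example the naive call wraps the indentation as well, which is the "programmer's whitespace" problem; the `drop_blank=True` call keeps only the text inside <p>.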