XML and Using It With Whitespace
Sorry if the subject is confusing, but it's a really concise proposal for getting whitespace through - guaranteed.

I earlier posted a query about the behaviour of various parsers surrounding whitespace. I'm not as hopeful as I was then, at least based on the answers I received. Thanks to those who took the time to reply.

Essentially, I had hoped that using &#32; to replace a space would allow for the creation of a 'magic' difference, the same way that &lt; and < are treated differently. Ideally all spaces could become &#32; and we could use the (invalid!) xml:space="none", leaving only the &#32; behind. It appears that most of the parsers will have a tough enough time consistently declaring ignorable whitespace in element content - tracking where in PCDATA a &#32; became a ' ' is just not on the radar.

That doesn't mean I'm abandoning the idea - the message authentication we're doing is important enough to the application that I'm prepared to sacrifice the use of all the parsers to get the above behaviour. It doesn't hurt that we are likely going to have standalone applications processing the XML stream - it's not really a file-based system.

I think parsers can still correctly read such files. But it points to a more general problem. If I read such a file with a parser, how can I write it out again exactly (and I mean *exactly*) the way it was read? If the parser doesn't indicate clearly where entity substitutions were made, then I can't put them back in the file.

The same problem occurs with empty elements. Although the XML spec wants to imply that <tag></tag> and <tag/> are the same, some might see them as the difference between zero-length content and null content. Either way, if the original XML contains <tag></tag>, then that is what should go back out. If it later contains <tag/>, then the two references should remain different from each other and unchanged.

To wrap up the options, I'll run through the same paragraph using three different techniques.
1. Basic.

    <p>
    Finally, the other idea is the one at the bottom - use elements for
    spaces, tabs, and lineends.  There is a single attribute n to indicate
    repeat counts.
    </p>

2. Using character entities - still my favourite, since they work in attributes as well. Out of all of them, this, to my eyes, looks like it could easily have been placed in the XML 1.0 spec without breaking anything else that is in the spec, simply by adding the xml:space="none". &spc; could be &#32; and &lf; is &#10;, so no new entities would have to be added.

    <p xml:space="none">Finally,&spc;the&spc;other&spc;idea&spc;is&spc;the
    &spc;one&spc;at&spc;the&spc;bottom&spc;-&spc;use&spc;elements&spc;for&lf;
    spaces,&spc;tabs,&spc;and&spc;lineends.&spc;&spc;There&spc;is&spc;a&spc;
    single&spc;attribute&spc;n&spc;to&spc;indicate&lf;repeat&spc;counts.</p>

3. With only elements.

    <p xml:space="none">Finally,<s/>the<s/>other<s/>idea<s/>is<s/>the<s/>
    one<s/>at<s/>the<s/>bottom<s/>-<s/>use<s/>elements<s/>for<l/>
    spaces,<s/>tabs,<s/>and<s/>lineends.<s n="2"/>There<s/>is<s/>a<s/>single
    <s/>attribute<s/>n<s/>to<s/>indicate<l/>repeat<s/>counts.</p>

Clearly, you must have the DTD to make sense of the last one! However, I see a rather interesting side-effect, namely that this one could likely be added using a namespace. (Tangent: any parsers experimenting with namespaces?)

In summary, the distinction is, as a reply noted, between "wanted" whitespace and "unwanted" whitespace. The XML specification wants to leave it to the application because there are far more 'whitespace convention sets' than it is desirable to put in the spec. However, there are far more applications than there are 'whitespace convention sets', and the application designer wants to pick one, not reinvent the wheel.

This seems to be the missing middle ground. How can we reusably specify the relatively few whitespace options we need,
which are more than the XML spec provides, but far fewer than the number of applications that we hope to see using XML?

---------------------------------------------------------------------------
Chris Smith <smith@i...>

xml-dev: A list for W3C XML Developers.
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
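
For illustration, technique 3 (whitespace carried only by elements) could be sketched in Python roughly as follows. This is only a sketch under stated assumptions, not an implementation from the post: the function names `encode` and `decode` are hypothetical, the tab element name `t` is assumed (the post shows only `<s/>` and `<l/>`), and the decoder simply discards all literal whitespace to stand in for the proposed xml:space="none" behaviour, which no parser provides.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Element names for "wanted" whitespace. "s" and "l" follow the post;
# "t" for tabs is an assumption made for this sketch.
CHAR_TO_TAG = {" ": "s", "\t": "t", "\n": "l"}
TAG_TO_CHAR = {tag: ch for ch, tag in CHAR_TO_TAG.items()}


def encode(text, tag="p"):
    """Replace each run of spaces, tabs, or newlines with one marker element."""
    parts = []
    # Each match is a run of one whitespace character, or a run of other text.
    for match in re.finditer(r" +|\t+|\n+|[^ \t\n]+", text):
        chunk = match.group(0)
        name = CHAR_TO_TAG.get(chunk[0])
        if name is None:
            parts.append(escape(chunk))            # ordinary text, XML-escaped
        elif len(chunk) == 1:
            parts.append(f"<{name}/>")             # single character
        else:
            parts.append(f'<{name} n="{len(chunk)}"/>')  # repeat count in n
    return f"<{tag}>" + "".join(parts) + f"</{tag}>"


def _drop_literal_whitespace(s):
    """Treat every literal whitespace character as unwanted."""
    return re.sub(r"[ \t\r\n]+", "", s or "")


def decode(xml_text):
    """Rebuild the text: marker elements give whitespace, literal whitespace is dropped."""
    root = ET.fromstring(xml_text)
    out = [_drop_literal_whitespace(root.text)]
    for child in root:
        count = int(child.get("n", "1"))
        out.append(TAG_TO_CHAR[child.tag] * count)  # KeyError on unknown elements
        out.append(_drop_literal_whitespace(child.tail))
    return "".join(out)
```

With this sketch, `encode("a  b")` gives `<p>a<s n="2"/>b</p>`, and re-wrapping the encoded string with literal newlines or indentation does not change what `decode` returns, which is the property wanted for message authentication. Carriage returns are not handled here; they would need their own marker element.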