Re: Gag me with a blunt …
At 06:00 PM 16/03/01 -0500, John Cowan wrote:
>XML 1.0 chose not to implement non-ASCII whitespace characters from Unicode 2.0.

To be honest, I don't think we ever articulated the principle, or proceeded from it. However, the discussion over Ideographic Space, U+3000, was long and agonized, so the decision to omit it from the production for "S" was not lightly taken at all. It's hard to see how you could let in U+0085 without letting in ideospace and a bunch more characters that have in some respect the characteristics of white-space-ness. Someone sent me a note offline giving a long list of such items, and it's pretty clear that letting in U+0085 could start us down a slippery slope.

Note (although no processor other than Lark ever did this, as far as I know) that if you want to build a DFA-based XML processor, you can use the trick of recognizing all the syntax characters with a 7-bit state table; only a remarkably small amount of clever sidestepping is required to deal with all the non-ASCII characters.

-Tim
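For readers who have not seen the 7-bit state table trick mentioned above, here is a minimal sketch of the general idea in Python. It is not Lark's code. The states, the toy grammar (bare start tags with no attributes or whitespace), and the names build_table, step and count_start_tags are all assumptions made for illustration. The transition table has only 128 columns, and every non-ASCII character is "sidestepped" by treating it as an ordinary letter, which is a deliberate simplification of the real XML name rules.

```python
# Toy DFA states (hypothetical; real XML needs many more).
TEXT, LT, NAME, ERROR = range(4)


def build_table():
    """Build a transition table with one column per 7-bit character."""
    table = [[ERROR] * 128 for _ in range(4)]
    for code in range(128):
        ch = chr(code)
        # In character data, '<' opens a tag; anything else stays text.
        table[TEXT][code] = LT if ch == "<" else TEXT
        # Right after '<', only a name-start character is accepted.
        if ch.isalpha() or ch in "_:":
            table[LT][code] = NAME
        # Inside a name, name characters continue it and '>' closes the tag.
        if ch.isalnum() or ch in "_:.-":
            table[NAME][code] = NAME
        elif ch == ">":
            table[NAME][code] = TEXT
    return table


def step(table, state, ch):
    """Advance one character, sidestepping anything outside ASCII."""
    code = ord(ch)
    if code >= 128:
        # Assumption: treat every non-ASCII character like the letter 'a'.
        # A real processor would check the XML name-character classes here.
        code = ord("a")
    return table[state][code]


def count_start_tags(text):
    """Count the simplified start tags in text, raising on a syntax error."""
    table = build_table()
    state, count = TEXT, 0
    for pos, ch in enumerate(text):
        nxt = step(table, state, ch)
        if nxt == ERROR:
            raise ValueError(f"unexpected {ch!r} at offset {pos}")
        if state == NAME and nxt == TEXT:
            count += 1
        state = nxt
    return count
```

The point of the design is that all the syntax-significant characters in this toy grammar are ASCII, so the table never needs a column for anything above 127. Under that assumption, U+3000 in character data is just text, and it is never consulted as whitespace.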