XML-DEV Mailing List Archive
Re: Gag me with a blunt …
At 06:00 PM 16/03/01 -0500, John Cowan wrote:
>XML 1.0 chose not to implement non-ASCII whitespace characters from Unicode 2.0.

To be honest, I don't think we ever articulated the principle, or proceeded from it. However, the discussion over Ideographic Space, U+3000, was long and agonized, so the decision to omit it from the production for "S" was not taken lightly at all.

It's hard to see how you could let in U+0085 without letting in ideospace and a bunch more characters that have in some respect the characteristics of white-space-ness. Someone sent me a note offline giving a long list of such items, and it's pretty clear that letting in U+0085 could start us down a slippery slope.

Note (although no processor other than Lark ever did this as far as I know) that if you want to build a DFA-based XML processor, you can use the trick of recognizing all the syntax characters with a 7-bit state table; only a remarkably small amount of clever sidestepping is then required to deal with all the non-ASCII characters.

-Tim
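The 7-bit state table trick described above might look roughly like the sketch below. This is not Lark's actual code: the character classes, the two-state DFA, and the helper names `classify` and `count_tags` are all hypothetical and chosen only for illustration. The point it tries to show is that every syntax character, including the four whitespace characters in the XML 1.0 production for "S" (#x20, #x9, #xD, #xA), fits in a 128-entry table, so anything non-ASCII can be sidestepped with a single comparison.

```python
# Hypothetical sketch of a 7-bit class table driving a tiny DFA.
# Not taken from Lark or any real XML processor.

WS, LT, GT, OTHER = range(4)   # character classes (assumed for this sketch)
CONTENT, IN_TAG = range(2)     # DFA states (assumed for this sketch)

# One entry per ASCII code point; everything defaults to OTHER.
ASCII_CLASS = [OTHER] * 128
for ch in "\x20\x09\x0D\x0A":  # the characters allowed by production S
    ASCII_CLASS[ord(ch)] = WS
ASCII_CLASS[ord("<")] = LT
ASCII_CLASS[ord(">")] = GT

# TRANSITIONS[state][char_class] -> next state
TRANSITIONS = [
    # WS       LT      GT       OTHER
    [CONTENT, IN_TAG, CONTENT, CONTENT],  # from CONTENT
    [IN_TAG,  IN_TAG, CONTENT, IN_TAG],   # from IN_TAG
]


def classify(ch):
    """Map a character to its class; non-ASCII skips the table entirely."""
    cp = ord(ch)
    if cp < 128:
        return ASCII_CLASS[cp]
    # The "sidestep": no non-ASCII character is syntax in this sketch,
    # so U+0085 and U+3000 are treated as ordinary data, not whitespace.
    return OTHER


def count_tags(text):
    """Count '<'...'>' spans by running the DFA over the text."""
    state = CONTENT
    tags = 0
    for ch in text:
        nxt = TRANSITIONS[state][classify(ch)]
        if state == IN_TAG and nxt == CONTENT:
            tags += 1
        state = nxt
    return tags
```

With this layout, admitting U+0085 or U+3000 into "S" would mean adding special cases to the non-ASCII branch of `classify`, which is the kind of extra sidestepping the restriction to ASCII whitespace avoids. A real processor would also need to handle names, comments, CDATA sections, and attribute values containing `>`, none of which this sketch attempts.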