Re: Blueberry is not "closed"
0> In article <5.1.0.14.2.20010724225913.020f5760@p...>,
0> Tim Bray <URL:mailto:tbray@t...> ("Tim") wrote:

Tim> Ouch, it's worse than I thought.  One of the "nice" things about
Tim> the UTF16 surrogate system is that if you don't have the apparatus
Tim> around to deal with astral-plane chars, you can just obliviously
Tim> treat 'em as pairs of characters you don't know.

Except that you have to be careful about how you count "characters".

Tim> But XML carefully rules out that possibility, prod [2] for "Char"
Tim> rules excludes surrogate blocks.  In retrospect, maybe that was
Tim> dumb?

In a Java environment, it's sensible to pass around surrogates in
String objects - think of it as using UTF-16 as the internal
representation, which is trivial if the input is UTF-16 and
(potentially) less trivial otherwise.

Production [2] doesn't say anything about what happens internally, of
course, as this is external syntax - it rules out numeric character
references to the surrogate area, or surrogate characters in UCS-2,
etc.

This actually makes things easier for a Java implementation, since
whenever you see a character from the surrogate area, you know it's
being used as one half of a surrogate pair.

Tim> Which means in effect that Dave's right, basically you just totally
Tim> can't use a java's String or char in dealing with Blueberry docs.
Tim> Or am I missing something... please?

It seems that you might need to at least temporarily combine
surrogates whilst parsing (or write your parser such that UTF-16 state
is taken account of), but I don't think the parser would need to
retain the UCS-4 form, and it seems okay to pass UTF-16 to downstream
components (as long as you don't split surrogate pairs!).

Tim> Or re-open the door to the UTF-16 hack by putting the surrogate
Tim> blocks back into [2] as part of the Blueberry update.

Ugh!

I knew a 16-bit char type would be a nuisance before too long!
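
To make the "temporarily combine surrogates" idea concrete, here is a rough Java sketch. It is only an illustration: the class SurrogateAwareChecker and all of its method names are hypothetical and not taken from any real parser. It walks UTF-16 code units, combines each surrogate pair just long enough to test it against production [2], and never retains the UCS-4 form, so the text can still be handed on as UTF-16. It also shows the two pitfalls mentioned above: counting "characters" and splitting a pair.

```java
// Hypothetical helper for illustration only; not part of any real XML parser.
final class SurrogateAwareChecker {

    // XML 1.0 production [2]:
    // Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    static boolean isHigh(char c) { return c >= 0xD800 && c <= 0xDBFF; }
    static boolean isLow(char c)  { return c >= 0xDC00 && c <= 0xDFFF; }

    // Standard UTF-16 arithmetic: two 16-bit halves -> one code point.
    static int combine(char high, char low) {
        return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
    }

    // Returns the index of the first offending code unit, or -1 if the
    // whole UTF-16 sequence consists of characters allowed by [2].
    static int firstInvalidIndex(CharSequence utf16) {
        int i = 0;
        while (i < utf16.length()) {
            char c = utf16.charAt(i);
            if (isHigh(c)) {
                // A high surrogate must be followed by a low one.
                if (i + 1 >= utf16.length() || !isLow(utf16.charAt(i + 1))) {
                    return i;
                }
                // Combined only for this check; the value is not stored.
                // A well-formed pair always lands in [#x10000-#x10FFFF].
                if (!isXmlChar(combine(c, utf16.charAt(i + 1)))) {
                    return i;
                }
                i += 2;
            } else if (isLow(c) || !isXmlChar(c)) {
                // A lone low surrogate, or a 16-bit value outside [2].
                return i;
            } else {
                i++;
            }
        }
        return -1;
    }

    // Counts characters rather than 16-bit units: the second half of a
    // surrogate pair does not count as a character of its own.
    static int countCharacters(CharSequence utf16) {
        int count = 0;
        for (int i = 0; i < utf16.length(); i++) {
            boolean secondHalf = i > 0
                && isLow(utf16.charAt(i))
                && isHigh(utf16.charAt(i - 1));
            if (!secondHalf) {
                count++;
            }
        }
        return count;
    }

    // True if cutting the sequence at 'index' would not separate the two
    // halves of a surrogate pair.
    static boolean isSafeSplit(CharSequence utf16, int index) {
        if (index <= 0 || index >= utf16.length()) {
            return true;
        }
        return !(isHigh(utf16.charAt(index - 1)) && isLow(utf16.charAt(index)));
    }
}
```

With the String "a\uD835\uDD04b" (U+1D504 between two ASCII letters), length() reports four char values while countCharacters reports three characters, and isSafeSplit refuses index 2, which falls between the two halves of the pair. A downstream component that only copies or compares whole Strings never needs to know any of this.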