Re: XML 1.1 grinds to halt?
Tim Bray scripsit:

> Really? I just looked at a recent set of Java docs, and it's pretty
> clear that a Java char isn't really a character, it's a UTF-16
> codepoint, and the semantics of String are wrong for non-BMP characters,
> and that the attempt at UTF-8 support remains pretty laughably
> nonstandard and wrong. I'd be *delighted* to hear that I'm looking at
> wrong/obsolete docs. Pointers anyone? -Tim

It's true that Java chars are UTF-16 codepoints; changing that would be
nothing less than revolutionary. I don't understand what's wrong with
the semantics of String, unless you mean that it's indexed by UTF-16
codepoints, which *is* what you are going to have 99% of the time.
ICU/J provides correct-but-less-efficient indexing for when you need it.

As for UTF-8, that's a canard. The methods DataOutputStream.writeUTF
and DataInputStream.readUTF have nothing to do with UTF-8 text transport:
they are *binary* methods that write and read a 16-bit byte length followed
by modified UTF-8 (no 0x00 bytes). You use those only if you are doing
roll-your-own binary serialization. The actual UTF-8 support is in
InputStreamReader and OutputStreamWriter and is entirely compliant.

--
John Cowan <jcowan@r...>
www.ccil.org/~cowan  www.reutershealth.com
Micropayment advocates mistakenly believe that efficient allocation of
resources is the purpose of markets. Efficiency is a byproduct of market
systems, not their goal. The reasons markets work are not because users
have embraced efficiency but because markets are the best place to allow
users to maximize their preferences, and very often their preferences are
not for conservation of cheap resources. --Clay Shirkey
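The two distinctions in the message above (String indexing by 16-bit units, and writeUTF versus real UTF-8 output) can be checked with a short program. This is only a sketch under assumptions: the class name Utf16Demo and its helper methods are made up for illustration, String.codePointCount requires Java 5 or later, StandardCharsets requires Java 7 or later, and the JDK code-point methods stand in here for the ICU/J indexing the message mentions.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative, not from the thread.
public class Utf16Demo {

    // String.length() counts 16-bit char values, so a non-BMP
    // character (a surrogate pair) counts as two.
    public static int codeUnitCount(String s) {
        return s.length();
    }

    // Counts whole characters, treating a surrogate pair as one.
    public static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    // Binary serialization: 16-bit byte length, then modified UTF-8.
    public static byte[] viaWriteUTF(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Text transport: standard UTF-8 through OutputStreamWriter.
    public static byte[] viaWriter(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(buf, StandardCharsets.UTF_8)) {
            w.write(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        // 'a', U+0000, and U+1D11E (a non-BMP character written as a surrogate pair).
        String s = "a\u0000\uD834\uDD1E";
        System.out.println(codeUnitCount(s));      // 4
        System.out.println(codePointCount(s));     // 3
        System.out.println(viaWriteUTF(s).length); // 11
        System.out.println(viaWriter(s).length);   // 6
    }
}
```

The byte counts differ because writeUTF prepends a two-byte length, encodes U+0000 as the two bytes 0xC0 0x80, and encodes each surrogate separately as three bytes, whereas OutputStreamWriter emits U+0000 as a single 0x00 byte and U+1D11E as the four-byte sequence F0 9D 84 9E.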