At 7:42 AM -0700 4/29/03, Tim Bray wrote:
>Really? I just looked at a recent set of Java docs, and it's pretty
>clear that a Java char isn't really a character, it's a UTF-16
>codepoint, and the semantics of String are wrong for non-BMP
>characters, and that the attempt at UTF-8 support remains pretty
>laughably nonstandard and wrong. I'd be *delighted* to hear that
>I'm looking at wrong/obsolete docs. Pointers anyone? -Tim

Unfortunately, you're more than half right. The InputStreamReader and
OutputStreamWriter classes do handle UTF-8 correctly. The readUTF and
writeUTF methods in DataInputStream/DataOutputStream don't. This
wouldn't be a problem if they were simply called readString/writeString
instead. However, your comments about the char types are dead on.

--
Elliotte Rusty Harold
elharo@m...
Processing XML with Java (Addison-Wesley, 2002)
http://www.cafeconleche.org/books/xmljava
http://www.amazon.com/exec/obidos/ISBN%3D0201771861/cafeaulaitA
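
The distinction drawn in the message can be seen in a few lines of Java. What follows is only a sketch under stated assumptions, not code from the original post: the class name Utf8Demo and its helper methods are hypothetical, and StandardCharsets assumes Java 7 or later (with the JDK of the time one would pass the string "UTF-8" instead). It encodes one non-BMP character, U+1D11E, both through OutputStreamWriter and through DataOutputStream.writeUTF, and compares the byte counts.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names here are illustrative only.
class Utf8Demo {
    // U+1D11E MUSICAL SYMBOL G CLEF lies outside the BMP, so a Java
    // String stores it as two chars (a surrogate pair).
    public static final String CLEF = "\uD834\uDD1E";

    // Encode through OutputStreamWriter, which emits standard UTF-8.
    public static byte[] viaWriter(String s) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            w.write(s);
        }
        return out.toByteArray();
    }

    // Encode through DataOutputStream.writeUTF, which writes a 2-byte
    // length prefix followed by Java's "modified UTF-8".
    public static byte[] viaWriteUTF(String s) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DataOutputStream d = new DataOutputStream(out)) {
            d.writeUTF(s);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // One character, but two chars: length() counts UTF-16 code units.
        System.out.println(CLEF.length());               // 2
        // Standard UTF-8 uses a single 4-byte sequence.
        System.out.println(viaWriter(CLEF).length);      // 4
        // writeUTF encodes each surrogate separately as 3 bytes,
        // plus the 2-byte length prefix: 2 + 3 + 3.
        System.out.println(viaWriteUTF(CLEF).length);    // 8
        // writeUTF also encodes U+0000 as the two bytes C0 80.
        System.out.println(viaWriteUTF("\u0000").length); // 4
    }
}
```

The output of writeUTF is therefore not something a standard UTF-8 decoder should be expected to accept, which is the sense in which a name like writeString would have been less misleading.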