CONSTANT_Utf8_info [Re: XML 1.1 grinds to halt?
when one considers java's implementation-specific 8-bit external string
encoding, one should keep its purpose[1] and the specified relation to
java's primitive data representations[2] in mind.

[1] http://java.sun.com/j2se/1.4.1/docs/api/java/io/DataInputStream.html
[2] http://java.sun.com/docs/books/vmspec/2nd-edition/html/ClassFile.doc.html#20080

Miles Sabin wrote:
>
> Elliotte Rusty Harold wrote,
> > At 7:42 AM -0700 4/29/03, Tim Bray wrote:
> > > Really? I just looked at a recent set of Java docs, and it's pretty
> > > clear that a Java char isn't really a character, it's a UTF-16
> > > codepoint, and the semantics of String are wrong for non-BMP
> > > characters, and that the attempt at UTF-8 support remains pretty
> > > laughably nonstandard and wrong. I'd be *delighted* to hear that
> > > I'm looking at wrong/obsolete docs. Pointers anyone? -Tim
> >
> > Unfortunately, you're more than half right. The InputStreamReader and
> > OutputStreamWriter classes do handle UTF-8 correctly. The readUTF and
> > writeUTF methods in DataInputStream/DataOutputStream don't. This
> > wouldn't be a problem if they were simply called readString/
> > writeString instead.
>
> Yup, that's right ... for all intents and purposes, readUTF and writeUTF
> should be treated as specifying a non-standard encoding solely for the
> use of Java RMI.
>
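
for anyone who wants to see the difference discussed above, here is a minimal sketch, not part of the original thread. it compares the bytes that DataOutputStream.writeUTF emits (after its 2-byte length prefix) with the bytes a standard UTF-8 encoder emits. the class name ModifiedUtf8Demo and its helper methods are made up for illustration, and it assumes a runtime that has java.nio.charset.StandardCharsets (Java 7 or later).

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class: contrasts writeUTF output with standard UTF-8.
public class ModifiedUtf8Demo {

    // Bytes written by DataOutputStream.writeUTF, with the leading
    // 2-byte length prefix stripped off.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Bytes produced by the standard UTF-8 charset encoder.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Space-separated upper-case hex dump of a byte array.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String nul = "\0";               // U+0000
        String gClef = "\uD834\uDD1E";   // U+1D11E, a non-BMP character

        System.out.println("U+0000  writeUTF: " + hex(modifiedUtf8(nul)));
        System.out.println("U+0000  UTF-8:    " + hex(standardUtf8(nul)));
        System.out.println("U+1D11E writeUTF: " + hex(modifiedUtf8(gClef)));
        System.out.println("U+1D11E UTF-8:    " + hex(standardUtf8(gClef)));
    }
}
```

the two cases shown are the ones where the formats diverge: writeUTF encodes U+0000 as two bytes rather than one, and encodes a non-BMP character as two separately encoded surrogates (six bytes) rather than one four-byte sequence. for BMP characters other than U+0000 the two outputs agree.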