
Re: Pushing all the buttons

At 8:33 AM -0400 9/21/03, David Megginson wrote:

>I think that James was talking about going from bytes representing a
>Unicode character encoding, not a binary encoding.  There should be no
>platform dependencies in that case.

I understood that, and my point still holds. There are platform 
dependencies in this case. If the native char and string types are 
built on UTF-8 (Perl, maybe?) then this is straightforward. 
However, when the native char and string types are based on UTF-16, a 
conversion is necessary. Ditto for UTF-16BE to UTF-16LE and vice 
versa. Or UTF-8/UTF-16 --> UTF-32. Languages and platforms do not 
share the same internal representations of Unicode. No one binary 
format will work for everyone.
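
As a rough illustration of the conversions mentioned above, here is a minimal Java sketch. The class name EncodingSizes and the helper transcode are made up for this example; only the java.nio.charset calls are standard. It shows that the same text has different byte sequences in UTF-8, UTF-16BE and UTF-16LE, and that getting from one to another means decoding and re-encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingSizes {
    // Hypothetical helper: re-encode bytes from one Unicode encoding to
    // another by way of a Java String, which is UTF-16 internally.
    public static byte[] transcode(byte[] input, Charset from, Charset to) {
        return new String(input, from).getBytes(to);
    }

    public static void main(String[] args) {
        String text = "<p>caf\u00e9</p>"; // 11 characters, one of them non-ASCII
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] utf16be = transcode(utf8, StandardCharsets.UTF_8, StandardCharsets.UTF_16BE);
        byte[] utf16le = transcode(utf16be, StandardCharsets.UTF_16BE, StandardCharsets.UTF_16LE);

        // Same characters, three different byte layouts.
        System.out.println("UTF-8:    " + utf8.length + " bytes");
        System.out.println("UTF-16BE: " + utf16be.length + " bytes");
        System.out.println("UTF-16LE: " + utf16le.length + " bytes");
    }
}
```

Whichever of these a binary format picks, platforms whose native strings use one of the others still have to pay for a transcoding step.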

This conversion is non-trivial too. In the current version of XOM I 
made a deliberate decision after profiling to store internal text node 
data in UTF-8 rather than UTF-16. That saves me a *lot* of memory. 
However, the constant conversion between the internal UTF-8 
representation and Java's UTF-16 representation imposes about a 10% 
speed penalty. I chose to optimize for size instead of speed in this 
case, but I wouldn't suggest imposing that cost on everyone by making 
all XML data UTF-8.
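
To make the trade-off concrete, here is a minimal sketch of the general technique. This is not XOM's actual code; the class CompactText and its methods are hypothetical names invented for this example. The text is held as UTF-8 bytes and converted to a Java String on every read, which is where the conversion cost comes from.

```java
import java.nio.charset.StandardCharsets;

public class CompactText {
    // Text is kept as UTF-8 bytes instead of a UTF-16 String.
    private final byte[] utf8;

    public CompactText(String value) {
        // Encode once when the value is stored.
        this.utf8 = value.getBytes(StandardCharsets.UTF_8);
    }

    public String getValue() {
        // Decode on every access: this repeated UTF-8 -> UTF-16
        // conversion is the speed cost paid for the smaller footprint.
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public int storedBytes() {
        return utf8.length;
    }

    public static void main(String[] args) {
        CompactText node = new CompactText("Hello, XML");
        int asUtf16 = 2 * node.getValue().length(); // two bytes per UTF-16 code unit
        System.out.println(node.getValue());
        System.out.println(node.storedBytes() + " bytes stored vs " + asUtf16 + " as UTF-16");
    }
}
```

For mostly-ASCII text the stored form is roughly half the size of the UTF-16 form, but the saving shrinks for text dominated by characters that need three bytes in UTF-8.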


   Elliotte Rusty Harold
   Processing XML with Java (Addison-Wesley, 2002)

