
Re: Unicode confusion

  • From: "Peter S. Housel" <housel@h...>
  • To: <xml-dev@i...>
  • Date: Tue, 4 Jan 2000 12:02:26 -0800

> No one's disagreeing with the use of Unicode; we're talking about
> which character encoding we'll use to represent it.  You can represent
> Unicode in variable-width 8-bit or 16-bit encodings or in fixed-width
> 32-bit encodings.

My reading of the Unicode 2.x standard is that the above isn't strictly
correct.  It is correct if you change "Unicode" to "the ISO 10646 Universal
Character Set" though.

> Note that Java uses UTF-16, which isn't quite fixed-width, though no
> one really notices.

It seems to me that Java uses Unicode, which maintains the semantics that 16
bits equals one character.  Surrogates are characters in Unicode, whereas
those code points are not legal UCS characters, but only artifacts of the
UTF-16 encoding.

Unicode looks like UTF-16, but the semantics are slightly different.  So a
file using UTF-16 encoding containing a single "astral plane" character of
the UCS would be interpreted by Unicode as a file containing *two* surrogate
characters.  (I think it's a strange tack to take, but it seems fairly clear
to me that this was their position as of Unicode 2.x.  I haven't looked at
3.0 yet, so things may have changed since then.)
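
The difference between counting 16-bit units and counting UCS characters can be sketched in Java. This is only an illustration, not anything from the Unicode or ISO texts: the class name SurrogateDemo and its helper methods are made up for the example, and the code point methods it calls (Character.toChars, String.codePointCount) were added to Java well after this message was written.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names here are illustrative only.
class SurrogateDemo {
    // Number of 16-bit code units Java uses to hold one code point.
    static int utf16Units(int codePoint) {
        return new String(Character.toChars(codePoint)).length();
    }

    // Number of UCS characters (code points) in a Java string.
    static int ucsCharacters(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        int astral = 0x10000; // first code point beyond the 16-bit range
        String s = new String(Character.toChars(astral));
        byte[] bytes = s.getBytes(StandardCharsets.UTF_16BE);

        // The same data, viewed three ways.
        System.out.println("bytes in UTF-16BE: " + bytes.length);
        System.out.println("16-bit units:      " + s.length());
        System.out.println("UCS characters:    " + ucsCharacters(s));

        // Each 16-bit unit on its own is a surrogate, not a UCS character.
        System.out.println("first unit is high surrogate: "
                + Character.isHighSurrogate(s.charAt(0)));
        System.out.println("second unit is low surrogate: "
                + Character.isLowSurrogate(s.charAt(1)));
    }
}
```

Under the "16 bits equals one character" reading, the string above has two characters; under the UCS reading it has one.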

The XML character set is the UCS, not Unicode.

-Peter-    housel@a...

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@i... the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)

