Re: UTF-8 vs UTF-16...?
Kragen Sitaker wrote:
>
> According to the latest Unicode book (is it version 2.0? Or 3.0?)
> UTF-8 does not allow you to encode more than the first 17 planes of ISO
> 10646.

The Unicode book has a bias: it only talks about the Unicode aspects of UTF-8. I've always felt that to be a disservice, since they didn't develop or standardize UTF-8 and are thus spreading misinformation. (They could at least _mention_ the fact that they're presenting a Unicode subset of full UTF-8!)

Better information is thankfully freely accessible. See:

    http://www.ietf.org/rfc/rfc2279.txt

which includes the details of the five and six byte encodings.

Note that even with a four byte subset of UTF-8, you can encode characters that can't be expressed in Unicode. A few of the test cases in the OASIS/NIST test suite (these cases happen to come from James Clark's XMLTEST package) have such characters; and any conformant XML processor must report a fatal error when it sees them.

- Dave

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@i... the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
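To make the byte-length point in the message concrete, here is a minimal Python sketch of the one-to-six byte scheme described in RFC 2279. The function name `utf8_rfc2279` is made up for this illustration and does not come from the message or from any library, and the sketch deliberately ignores surrogate handling. It is meant only to show that a four byte sequence can carry values up to 0x1FFFFF, which is past the last Unicode code point, U+10FFFF.

```python
def utf8_rfc2279(ucs4: int) -> bytes:
    """Encode a UCS-4 value with the 1-6 byte scheme of RFC 2279.

    Hypothetical helper for illustration; surrogates are not special-cased.
    """
    if not 0 <= ucs4 <= 0x7FFFFFFF:
        raise ValueError("RFC 2279 UTF-8 covers 0 through 0x7FFFFFFF")
    if ucs4 < 0x80:
        return bytes([ucs4])  # single byte, identical to ASCII

    # (exclusive upper limit, lead-byte marker, number of continuation bytes)
    for limit, lead, trailing in (
        (0x800, 0xC0, 1),       # 2 bytes
        (0x10000, 0xE0, 2),     # 3 bytes
        (0x200000, 0xF0, 3),    # 4 bytes
        (0x4000000, 0xF8, 4),   # 5 bytes
        (0x80000000, 0xFC, 5),  # 6 bytes
    ):
        if ucs4 < limit:
            break

    # The lead byte carries the top bits; each continuation byte carries 6 more.
    out = [lead | (ucs4 >> (6 * trailing))]
    for i in range(trailing - 1, -1, -1):
        out.append(0x80 | ((ucs4 >> (6 * i)) & 0x3F))
    return bytes(out)


if __name__ == "__main__":
    for value in (0x10FFFF, 0x110000, 0x1FFFFF, 0x200000, 0x7FFFFFFF):
        print(f"{value:#x}: {utf8_rfc2279(value).hex(' ')}")
```

Here 0x110000 is the first value beyond the Unicode range, yet it still fits in four bytes (`f4 90 80 80`). A decoder restricted to Unicode, such as Python's built-in `bytes.decode("utf-8")`, rejects that sequence, which is the kind of input the test cases mentioned above exercise.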