Re: Unicode surrogate block in XML?
Tony Graham (tgraham@m...) Fri, 17 Sep 1999 01:15:51 -0400 (EST)

>> In any XML document, you can make numeric references to any Unicode character in the range #x10000 to #x10FFFF (as well as to any other legal character number). These references are independent of the encoding used in the XML document. <<

Is it really correct to refer to #x10FFFF, say, as a Unicode character, since Unicode characters are limited to 16 bits? I'd think it's necessary here to refer to that as a UCS-4 character.

>> The sequence of #xD800 #xDC00 is the two Surrogate code values that address #x10000. That four-byte sequence may occur in a UTF-16 encoded file to represent #x10000. In contrast, "&#xD800;&#xDC00;" in an XML document is two illegal character references in a row. <<

I've been trying to fathom the distinction between Unicode and UTF-16, if there is one, and how these in turn relate to the UCS-2 encoding of ISO 10646. There's also the question of whether an XML document can be stored directly in Unicode, or whether instead it must be stored in either UTF-8 or UTF-16, as Section 2.2 seems to imply when it says ``all XML processors must accept the UTF-8 and UTF-16 encodings of 10646''. The latter appears to be the case; but if it isn't, then how would an XML document be stored directly in Unicode?

I've pondered both Appendix C of the Unicode Standard and the relevant part of the FAQ on the Unicode website, and I'm still unclear about all of this. (By the way, the FAQ erroneously refers to UTF as the Unicode Transformation Format rather than the UCS transformation format.)

In any event, thanks, Tony, for your very enlightening response to my original query.

Paul Abrahams

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@i... the following message; (un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message; subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
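
To make the quoted point about surrogates concrete, here is a minimal Python sketch. It is an illustration only and not part of the original exchange: the helper names to_surrogates, from_surrogates, and is_well_formed are hypothetical, and the well-formedness check assumes the expat-based parser behind Python's standard xml.etree.ElementTree module. It shows that #x10000 corresponds to the surrogate code values #xD800 #xDC00, that those occupy four bytes in UTF-16, and that a single numeric reference to #x10000 parses while a pair of references to the surrogate values does not.

```python
import xml.etree.ElementTree as ET


def to_surrogates(code_point):
    """Split a code point in #x10000..#x10FFFF into (high, low) surrogate values."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above #xFFFF use surrogate pairs")
    offset = code_point - 0x10000          # 20-bit offset into the supplementary range
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def from_surrogates(high, low):
    """Recombine a (high, low) surrogate pair into the code point it addresses."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    high, low = to_surrogates(0x10000)
    print(f"#x10000 -> #x{high:04X} #x{low:04X}")

    # The same pair, seen as bytes in a big-endian UTF-16 file (four bytes).
    print(chr(0x10000).encode("utf-16-be").hex())

    # One reference to the real character number versus two surrogate references.
    print(is_well_formed("<a>&#x10000;</a>"))
    print(is_well_formed("<a>&#xD800;&#xDC00;</a>"))
```

The surrogate values are an artifact of the UTF-16 encoding, which is why they may appear as bytes in a UTF-16 file but not as character numbers in a reference.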