Re: Unicode surrogate block in XML?
At 06:12 PM 9/16/99 -0400, Paul W. Abrahams wrote:
>The XML 1.0 spec explicitly excludes the Unicode surrogate characters
>from XML documents (production 2). It now seems, from information
>I've picked up on the Unicode web site, that surrogate characters are
>likely to play a more important role in the future, since the
>available 16-bit characters are almost all used up. (Unicode 2.0 has
>18,134 spares but Unicode 3.0 has only 7827 spares. The trend is
>clear.)

No. Production [2] says

[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]

This follows the Unicode model in allowing 17 planes of 64k characters each, i.e. about a million characters. For this to work in UTF-16, you need surrogate pairs. What XML rules out is *characters* whose numeric value is that of one-half of a surrogate pair. There will never be any such characters precisely because those values are reserved for use in surrogate pairs. That's why XML rules them out.

-Tim

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@i... the following message; (un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message; subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
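
The distinction drawn in the reply above, between a character and one half of a surrogate pair, can be illustrated with a short Python sketch. It is not part of the original message: the function names `is_xml_char` and `utf16_code_units` are hypothetical, the first simply transcribes the ranges of production [2], and the second applies the standard UTF-16 surrogate-pair arithmetic.

```python
def is_xml_char(cp):
    """Return True if code point cp matches XML 1.0 production [2] (Char)."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def utf16_code_units(cp):
    """Return the UTF-16 code units (as ints) that encode code point cp.

    Hypothetical helper for illustration; it rejects values that are not
    characters at all, including the surrogate range D800-DFFF.
    """
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %#x" % cp)
    if cp < 0x10000:
        return [cp]  # plane 0: a single 16-bit code unit
    offset = cp - 0x10000                # 20-bit offset into planes 1..16
    high = 0xD800 + (offset >> 10)       # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits -> low surrogate
    return [high, low]


if __name__ == "__main__":
    first_supplementary = 0x10000
    units = utf16_code_units(first_supplementary)
    print(is_xml_char(first_supplementary))      # the character itself
    print([hex(u) for u in units])               # its two UTF-16 code units
    print([is_xml_char(u) for u in units])       # each half, taken as a character
```

In this sketch U+10000 is accepted by `is_xml_char`, while each of the two code units that encode it in UTF-16 is rejected when treated as a character in its own right, which is the point of the reply: the pair is an encoding detail, and the excluded values never denote characters.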