RE: Unix/Java design issues (Was: Re: Is CDATA "structure"?)
At 02:48 PM 7/21/99 -0400, Hunter, David wrote:
>"all XML processors <em>must</em> accept the UTF-8 and UTF-16
>encodings of 10646" (emphasis added), since [I believe] UTF-8 and UTF-16 are
>the most common ways to store Unicode characters.

Unfortunately, no. I suspect that if you took a worldwide inventory, the four most common formats would be:

1. A Microsoft codepage that is almost but not quite ISO-8859-1
2. ASCII
3. EBCDIC
4. Shift-JIS

(not necessarily in that order)

Pure ASCII is UTF-8 as it sits, but as the Net becomes less and less Anglocentric, there is amazingly little pure ASCII being created any more. The XML spec chose UTF-8 and UTF-16 because unlike the other specimens in the list above, they can encode data containing arbitrary mixtures of different character sets.

-Tim

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@i... the following message; (un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message; subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)