RE: control characters
> Mapping 0x0000 - 0x001F to the private use area sounds like the
> "correct" Unicode thing to do, but for US-ASCII/UTF-8 documents I
> would map to 0x0080 - 0x009F instead.
> This way you preserve the deprecated Anglo-centric English-only
> bigoted assumption of 1 character == 1 byte.
>
> The only downside is that someone might actually have data in this
> range. I think this is about as likely as someone having data in the
> private use area.
> -Wayne Steele

In case you are interested: many charsets/encodings use this range as well. The reason is historical: using it made it possible to keep the Latin alphabet in its ASCII place while having national alphabets at the same time.

That's not directly related to the question, but it makes the chance "that someone might actually have data in this range" much higher.

Eldar Musayev

***************************************************************************
This is xml-dev, the mailing list for XML developers.
To unsubscribe, mailto:majordomo@x...&BODY=unsubscribe%20xml-dev
List archives are available at http://xml.org/archives/xml-dev/
***************************************************************************
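For illustration only, here is a rough Python sketch of the 0x0000 - 0x001F to 0x0080 - 0x009F mapping discussed above, and of how it collides with data that already uses that range. The function names `escape_c0` and `unescape_c0` and the fixed offset of 0x80 are assumptions made for this example, not something proposed in the thread, and windows-1251 is just one encoding picked as an example of a charset that puts letters in this byte range.

```python
# Hypothetical helpers: the names and the fixed 0x80 offset are assumptions.
C0_FIRST, C0_LAST = 0x00, 0x1F   # control range to be escaped
C1_FIRST, C1_LAST = 0x80, 0x9F   # target range suggested in the quoted message
OFFSET = 0x80


def escape_c0(text: str) -> str:
    """Shift every character in U+0000-U+001F up into U+0080-U+009F."""
    return "".join(
        chr(ord(ch) + OFFSET) if C0_FIRST <= ord(ch) <= C0_LAST else ch
        for ch in text
    )


def unescape_c0(text: str) -> str:
    """Reverse escape_c0: shift U+0080-U+009F back down to U+0000-U+001F."""
    return "".join(
        chr(ord(ch) - OFFSET) if C1_FIRST <= ord(ch) <= C1_LAST else ch
        for ch in text
    )


# Round trip works when the input has nothing in U+0080-U+009F to begin with.
original = "a\x01b\x1f"
assert unescape_c0(escape_c0(original)) == original

# Collision: a character already in the target range is wrongly shifted down.
assert unescape_c0("\x85") == "\x05"

# In windows-1251 (assumed example), byte 0x80 is a Cyrillic letter (U+0402),
# so such bytes are ordinary text there, not a free escape area.
assert b"\x80".decode("cp1251") == "\u0402"
```

Note that the shifted characters take one byte each when encoded as ISO-8859-1 but two bytes each when encoded as UTF-8, so the "1 character == 1 byte" property depends on which encoding is actually used on the wire.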