XML-DEV Mailing List Archive
Re: control characters
>The workaround I usually suggest is to represent control characters
>with (references to) characters from the Unicode private use range.
>This makes the necessary transformation a simple character
>substitution (which can even be just a subtraction - no need for a
>table).
>
> -- Richard

Actually, as someone has already pointed out, 0x007F - 0x009F are fair game for XML documents, and Unicode defines these as control character aliases.

Mapping 0x0000 - 0x001F to the private use area sounds like the "correct" Unicode thing to do, but for US-ASCII/UTF-8 documents I would map to 0x0080 - 0x009F instead. This way you preserve the deprecated, Anglo-centric, English-only, bigoted assumption that 1 character == 1 byte.

The only downside is that someone might actually have data in this range. I think this is about as likely as someone having data in the private use area.

XSLT will not _ALWAYS_ give you a perfect output format. XML --> XSLT --> simple_text_filter seems like a win to me.

-Wayne Steele
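
The substitution discussed above can be sketched as a fixed offset added to each control character before the text goes into XML, and subtracted again by the text filter on the way out. This is only an illustration: the function names and the two offset constants are hypothetical, and leaving tab, line feed, and carriage return untouched is an assumption based on those three being legal in XML 1.0 already. Note that in UTF-8 the code points 0x0080 - 0x009F take two bytes each, so the one-byte-per-character property holds only in a single-byte encoding such as ISO-8859-1.

```python
# Hypothetical names and offsets, chosen for illustration only.
C1_OFFSET = 0x80      # maps 0x0000-0x001F onto 0x0080-0x009F
PUA_OFFSET = 0xE000   # maps 0x0000-0x001F onto the start of the private use area

# Tab, line feed and carriage return are legal in XML 1.0, so leave them alone.
XML_LEGAL_CONTROLS = {0x09, 0x0A, 0x0D}


def escape_controls(text, offset=C1_OFFSET):
    """Shift XML-illegal control characters up by a fixed offset."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x20 and cp not in XML_LEGAL_CONTROLS:
            out.append(chr(cp + offset))
        else:
            out.append(ch)
    return "".join(out)


def unescape_controls(text, offset=C1_OFFSET):
    """Reverse escape_controls: a plain subtraction, no lookup table."""
    out = []
    for ch in text:
        cp = ord(ch) - offset
        if 0 <= cp < 0x20 and cp not in XML_LEGAL_CONTROLS:
            out.append(chr(cp))
        else:
            out.append(ch)
    return "".join(out)
```

The round trip is lossless only if the original data contains nothing in the target range, which is the downside mentioned above.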