Re: C1 characters in XML 1.0 and HTML 4
* Waters, Michael, Springer US wrote:
>Unicode character U+0092 is given as a control character in a private
>use area. I can't see our vendor or any workflow step (un)intentionally
>adding that character. About the only thing that makes sense to me is
>that at some point (probably the source document), Windows-1252 encoding
>was used, where decimal 146 is, I think, a right single quote. (Whether
>that's the appropriate character in this case is another matter.)

That is likely, yes. It might also come from some other set like
Mac-Roman, though I've not checked what the code represents there
(and I would not know if this wasn't a typo to begin with).

>So, in all the XML processes, character U+0092 was passed through as
>legal, but in outputting to HTML it is illegal? I'm missing something
>here, surely.

XML 1.0 documents may use C1 control characters. In your case, though,
you don't seem to actually mean to use C1 control characters. (Not that
anyone should care, but XML 1.1 allows the C1 control characters only in
the form of character references, and the SGML declaration for HTML 4.01
marks the C1 control characters as unused.) So there is little consensus
about the status of C1 controls.
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
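
The Windows-1252 explanation discussed in the message can be checked with a few lines of Python. This is only a sketch, assuming the stray U+0092 came from Windows-1252 bytes that were decoded as ISO-8859-1 somewhere in the workflow; the helper name `repair_c1` is made up for illustration and is not part of any tool mentioned in the thread.

```python
# Byte 0x92 (decimal 146) means different things in different encodings.
assert b"\x92".decode("cp1252") == "\u2019"   # RIGHT SINGLE QUOTATION MARK
assert b"\x92".decode("latin-1") == "\u0092"  # C1 control (PRIVATE USE TWO)


def repair_c1(text: str) -> str:
    """Reinterpret C1 code points (U+0080..U+009F) as Windows-1252 bytes.

    Hypothetical helper: it assumes every C1 control in `text` is a
    mis-decoded Windows-1252 byte, which may not hold for all input.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if 0x80 <= cp <= 0x9F:
            try:
                out.append(bytes([cp]).decode("cp1252"))
            except UnicodeDecodeError:
                # A few bytes (e.g. 0x81) are undefined in Python's
                # cp1252 codec; leave those characters untouched.
                out.append(ch)
        else:
            out.append(ch)
    return "".join(out)


print(repair_c1("don\u0092t"))
```

Whether the right single quote is the character the author intended is a separate question, as the original poster notes, so the output of a repair like this still needs to be reviewed.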