Re: Fw: Encodings and how they're specified
Hermann Stamm-Wilbrandt scripsit:

> So an XML processor/parser should be able to deal with ebcdic.xml and
> correctly determine its "ebcdic-de" encoding, right?

"Should" is too strong. Many, if not most, XML parsers will not understand this encoding, though in that case they should successfully reject it.

Appendix F explains how to identify a generic EBCDIC XML document by looking for the "4C 6F A7 94" bytes with which it must begin, though it is still necessary to read through the encoding declaration in order to determine the exact flavor of EBCDIC in use. The invariant character set (00640) can be used to decode the specified encoding name, unless the encoding is code page 290, which does not have lower-case Latin letters anyway.

http://recycledknowledge.blogspot.com/2005/07/hello-i-am-xml-encoding-sniffer.html gives a detailed algorithm.

-- 
Note that nobody these days would clamor for fundamental laws
of *the theory of kangaroos*, showing why pseudo-kangaroos are
physically, logically, metaphysically impossible.
Kangaroos are wonderful, but not *that* wonderful.
        --Dan Dennett on zombies

John Cowan
firstname.lastname@example.org
http://www.ccil.org/~cowan
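
As a rough illustration of the Appendix F approach described above (and not the algorithm from the linked blog post), here is a minimal Python sketch. It checks for the 4C 6F A7 94 prefix and then pulls the encoding name out of the declaration. The function name sniff_ebcdic_encoding, the 200-byte read limit, and the use of Python's cp037 codec as a stand-in for the invariant character set are all assumptions made for this example; the code page 290 exception is not handled.

```python
import re

# "<?xm" as encoded in EBCDIC, per the byte sequence quoted above.
EBCDIC_XML_PREFIX = bytes([0x4C, 0x6F, 0xA7, 0x94])

# Matches encoding="name" or encoding='name' inside the XML declaration.
ENCODING_RE = re.compile(r"""encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']""")


def sniff_ebcdic_encoding(data: bytes):
    """Return the declared encoding name if data looks like EBCDIC XML, else None."""
    if not data.startswith(EBCDIC_XML_PREFIX):
        return None
    # Assumption: cp037 is used only to read the invariant characters
    # (letters, digits, quotes, '=', '-') of the declaration, not the document.
    head = data[:200].decode("cp037", errors="replace")
    end = head.find("?>")
    if end == -1:
        return None
    match = ENCODING_RE.search(head[:end])
    return match.group(1) if match else None
```

The caller still has to map the returned name (for example "ebcdic-de") to a decoder it actually supports, and should reject the document if it has none.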