Re: How to read the encoding of an XML document
[James Garriss]
> At 08:30 PM 10/25/2001 +0100, David Carlisle wrote:
> >> If I no longer know what my original XML document was encoded as, how do I
> >> know the appropriate encoding set to specify for the output?
> >
> >every xml application is mandated to support at least the utf8 and utf16
> >encodings, so either of those is always appropriate (or at least
> >acceptable) whatever the original encoding of the file.
>
> Ok. If you recall, I started this discussion by mentioning that I am
> receiving XML documents from several European countries. So the pertinent
> question for me is "if UTF-8 and/or UTF-16 will be the output encoding set
> I must use, will they handle characters from the languages I care about?"
>
> I found this statement on unicode.org:
>
> "What Characters Does the Unicode Standard Include? The Unicode Standard
> defines codes for characters used in the major languages written today.
> Scripts include the European alphabetic scripts, Middle Eastern
> right-to-left scripts, and scripts of Asia."
>
> So it seems to me that I should be safe outputting my data to UTF-16. That
> make sense?
>

Yes. At least, any xml processor would be able to handle either utf-8 or
utf-16. What may be displayed by a browser or word processor, though (if you
transform it into a displayable document), is another question. utf-8 might
be a better choice depending on what is going to consume it.

Cheers,

Tom P

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
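
As a rough illustration of the point in this thread, here is a minimal Python sketch (not part of the original exchange) that parses a document declared as ISO-8859-1 and re-serializes it as UTF-8 and as UTF-16. The sample document and its element name are made up for the example; the only library used is the standard `xml.etree.ElementTree` module. It suggests, for this one sample at least, that Western European characters survive the change of output encoding.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: a small document declared and encoded as ISO-8859-1.
source = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>'
    "<city>Málaga, Köln, Besançon</city>"
).encode("iso-8859-1")

# The parser takes the input encoding from the XML declaration.
root = ET.fromstring(source)

# Re-serialize in the two encodings every XML processor must support.
as_utf8 = ET.tostring(root, encoding="utf-8")
as_utf16 = ET.tostring(root, encoding="utf-16")

# Parse each output again and compare the text with the original.
round_trip_ok = (
    ET.fromstring(as_utf8).text == root.text
    and ET.fromstring(as_utf16).text == root.text
)
print(round_trip_ok)
```

For the same text the UTF-16 output will usually be the larger of the two when the content is mostly Latin letters, which is one reason UTF-8 may be the more convenient choice depending on the consumer.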