RE: How to read the encoding of an XML document
I asked:
> When you say Unicode, does that equate to UTF-8, UTF-16, UTF-32 or
> something else? Or does the answer depend upon the XML parser you are
> using, which in my case is MSXML3.0?

Michael Kay wrote:

> When the XML is in a file on disk, each Unicode character is
> represented by one or more bytes, so it's reasonable to talk about
> encoding. When the XML has been parsed and is passed to your
> application via an API, the characters are typically variables of
> some data type depending on your programming language, so their
> binary representation is no longer of any concern.

David Carlisle wrote:

> So your source might be in latin-2 and your stylesheet might be in
> latin-1 but by the time they have both been parsed everything is in
> abstract Unicode characters and it is these that are compared in any
> XSLT query. (In fact MSXML3 uses UTF-16 but this is an internal detail
> that has no effect on the stylesheet)

OK, I think you two are saying the same thing. As long as my XML and my XSL are DOMDocuments, encoding is not relevant.

In my case, they aren't going to stay DOMDocuments for long. I'm going to call transformNodeToObject and save the results to a file, either as XML or HTML, depending upon the xsl:output method. If I no longer know what my original XML document was encoded as, how do I know the appropriate encoding to specify for the output?

In XML I was going to do

    xsl:output encoding="whatever the input xml was"

In HTML I was going to do

    META content="text/xml; charset=whatever the input XML was"

Very much appreciating the expert responses,

--James Garriss

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
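
The point made in the quoted replies can be tried out with a small sketch. This is not MSXML code: it assumes Python's standard `xml.etree.ElementTree` as a stand-in parser, and the element name `w` and the sample text are made up for illustration. It uses ISO-8859-1 and UTF-8 rather than latin-2, purely to keep the example simple. The same text is stored in two differently encoded documents, the parsed values compare equal, and the output encoding is then chosen at serialization time without reference to the input encoding.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample text containing one non-ASCII character (e-acute).
text = "caf\u00e9"

# The same logical document stored on disk in two different encodings.
latin1_doc = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><w>' + text + "</w>"
).encode("iso-8859-1")
utf8_doc = (
    '<?xml version="1.0" encoding="UTF-8"?><w>' + text + "</w>"
).encode("utf-8")

# On disk the byte sequences differ...
bytes_differ = latin1_doc != utf8_doc

# ...but once parsed, both yield the same Unicode string.
latin1_text = ET.fromstring(latin1_doc).text
utf8_text = ET.fromstring(utf8_doc).text
parsed_equal = latin1_text == utf8_text

# The output encoding is an independent choice made when serializing;
# it does not have to match whatever the input happened to use.
root = ET.fromstring(latin1_doc)
as_utf8 = ET.tostring(root, encoding="utf-8")
as_latin1 = ET.tostring(root, encoding="iso-8859-1")

print(bytes_differ, parsed_equal)
```

In this sketch the serializer only needs to be told which encoding to write; nothing about the original file's encoding has to be remembered, as long as the chosen output encoding can represent the characters in the result.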