Re: Missing byte-order mark problem
Vivek Shinde wrote:
> For last two days I was struggling with a problem of applying a
> XSL stylesheet to XML that had Danish characters (using entities
> like ø etc.). The output=HTML was working fine but when I
> tried to get text output I kept getting "Missing byte-order mark".
> I tried it with encoding of UTF-8 as well as UTF-16, it did not work.
> Finally I found a listing on google from this group from way back
> in 2002 http://www.xslt.com/xsl-list/2002-02/msg00675.html and it
> suggested to use encoding="iso-8859-1" and walla...it worked.

Trial and error is not a very good way to go about document authoring
or XSLT programming.

In the prolog of an XML document, encoding="iso-8859-1" is an assertion
that the document's bytes map to Unicode characters according to the
iso-8859-1 encoding. This declaration may be entirely false, as you may
have saved the document in UTF-8 or UTF-16 or some other format. The XML
spec requires it to be a truthful statement, though, so that an XML
parser will know how to interpret the bytes.

In the xsl:output element, encoding="iso-8859-1" tells the XSLT
processor that after it is done building the result tree, you would
like it to be serialized as bytes according to the iso-8859-1 encoding.

"Missing byte-order mark" indicates that your XML parser is trying to
read a document under the assumption that it is utf-16 encoded (1 or 2
pairs of bytes per character, plus a 2-byte sequence at the beginning
of the document to indicate whether the low or high byte comes first in
each pair), but you are in fact feeding it a document encoded as
iso-8859-1 or windows-1252 (or any other non-BOM-using encoding).

Most likely the cause is that your XML prolog contains an
encoding="utf-16" declaration (or you've somehow told the XML parser
externally that the document is utf-16), when the document is actually
iso-8859-1 or windows-1252 encoded.
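
The mismatch described above can be checked before the document ever
reaches the parser. What follows is only a minimal sketch in Python,
not anything taken from a particular XML parser: the helper names
(declared_encoding, has_utf16_bom, check_prolog) are hypothetical, and
the check assumes the prolog bytes are ASCII-compatible, which is the
case when an iso-8859-1, windows-1252, or UTF-8 file wrongly claims to
be utf-16.

```python
import codecs
import re

# Hypothetical helper pattern: finds the encoding name in the XML declaration.
# It only matches when the prolog bytes are ASCII-compatible (iso-8859-1,
# windows-1252, UTF-8), i.e. the mismatch case this sketch is meant to catch.
_DECL = re.compile(rb"<\?xml[^>]*?encoding=[\"']([A-Za-z0-9._-]+)[\"']")


def declared_encoding(data: bytes):
    """Return the lower-cased encoding named in the prolog, or None."""
    match = _DECL.match(data)
    return match.group(1).decode("ascii").lower() if match else None


def has_utf16_bom(data: bytes) -> bool:
    """True if the data starts with a UTF-16 byte-order mark (FF FE or FE FF)."""
    return data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))


def check_prolog(data: bytes) -> str:
    """Flag a document that claims utf-16 but carries no byte-order mark."""
    if declared_encoding(data) == "utf-16" and not has_utf16_bom(data):
        return "declared utf-16 but no byte-order mark"
    return "ok"


if __name__ == "__main__":
    text = '<?xml version="1.0" encoding="utf-16"?><a>\u00f8</a>'
    # Saved as iso-8859-1 while the prolog says utf-16: the false declaration.
    print(check_prolog(text.encode("iso-8859-1")))
    # Saved as real UTF-16: Python's "utf-16" codec writes the BOM itself.
    print(check_prolog(text.encode("utf-16")))
    # The same character becomes different bytes in different encodings.
    print("\u00f8".encode("iso-8859-1"), "\u00f8".encode("utf-8"))
```

The sketch only reports the problem. The fix is still to make the
declaration match how the file was actually saved, or to re-save the
file in the declared encoding.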

-Mike

PS- It's "voilà" -- http://www.bartleby.com/61/81/V0138100.html

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list