Re: Missing byte-order mark problem

Subject: Re: Missing byte-order mark problem
From: Mike Brown <mike@xxxxxxxx>
Date: Sun, 3 Aug 2003 16:30:58 -0600 (MDT)
Vivek Shinde wrote:
> For last two days I was struggling with a problem of applying a 
> XSL stylesheet to XML that had Danish characters (using entities 
> like &#248; etc.). The output=HTML was working fine but when I 
> tried to get text output I kept getting "Missing byte-order mark".
> I tried it with encoding of UTF-8 as well as UTF-16, it did not work.
>  Finally I found a listing on google from this group from way back
> in 2002 http://www.xslt.com/xsl-list/2002-02/msg00675.html and it 
> suggested to use encoding="iso-8859-1" and walla...it worked.

Trial and error is not a very good way to go about document authoring
or XSLT programming.

In the prolog of an XML document, encoding="iso-8859-1" is an assertion that
the document's bytes map to Unicode characters according to the iso-8859-1
encoding. This declaration may be entirely false, as you may have saved the
document in UTF-8 or UTF-16 or some other format. The XML spec requires it to
be a truthful statement, though, so that an XML parser will know how to
interpret the bytes.
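
As a rough illustration of why the declaration has to be truthful, here is a
minimal sketch. Python and its xml.etree.ElementTree module are assumptions
chosen only for illustration (they are not part of the original question), and
the sample word is made up. The same prolog is paired once with bytes that
match it and once with bytes that do not.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; \u00f8 is the Danish o-slash (&#248;).
DOC = '<?xml version="1.0" encoding="iso-8859-1"?><word>sm\u00f8r</word>'

# Truthful declaration: the bytes really are iso-8859-1.
honest = DOC.encode("iso-8859-1")
honest_text = ET.fromstring(honest).text

# False declaration: same prolog, but the bytes were saved as UTF-8.
# The parser trusts the prolog and reads the two UTF-8 bytes of the
# o-slash as two separate iso-8859-1 characters.
lying = DOC.encode("utf-8")
garbled_text = ET.fromstring(lying).text

print(repr(honest_text))
print(repr(garbled_text))
```

Note that the second parse does not fail at all; it quietly returns the wrong
characters, which is arguably worse than an error message.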

In the xsl:output instruction element, encoding="iso-8859-1" is there to
notify the XSLT processor that after it is done building the result tree,
you would like it to be serialized as bytes according to the iso-8859-1
encoding.
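
To make the serialization step concrete, here is a small sketch. Python's
standard library has no XSLT processor, so this uses ElementTree's own
serializer as a stand-in for what xsl:output asks for; the element name and
text are invented for the example. The tree is the same each time and only the
requested output encoding changes.

```python
import xml.etree.ElementTree as ET

# A tiny in-memory tree standing in for an XSLT result tree.
root = ET.Element("word")
root.text = "sm\u00f8r"

# Text output, serialized as bytes in two different encodings.
text_latin1 = ET.tostring(root, encoding="iso-8859-1", method="text")
text_utf8 = ET.tostring(root, encoding="utf-8", method="text")

# XML output in iso-8859-1; the serializer also writes a matching
# encoding declaration so the result describes itself truthfully.
xml_latin1 = ET.tostring(root, encoding="iso-8859-1")

print(text_latin1)
print(text_utf8)
print(xml_latin1)
```

The o-slash comes out as one byte in iso-8859-1 and as two bytes in UTF-8,
even though the tree being serialized is identical.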

"Missing byte-order mark" indicates that your XML parser is trying to read a
document under the assumption that it is utf-16 encoded (1 or 2 pairs of bytes
per character, plus a 2-byte sequence at the beginning of the document to
indicate whether the low or high byte comes first in each pair), but you are
in fact feeding it a document encoded as iso-8859-1 or windows-1252 (or any
other encoding that does not use a BOM).
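
If it helps to see the byte-order mark itself, the following sketch compares
the first bytes of the same string in the two encodings. It is only an
illustration in Python (my choice, not anything from the original setup), and
the sample word is made up.

```python
import codecs

text = "sm\u00f8r"
UTF16_BOMS = (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)

# The generic "utf-16" codec prepends a 2-byte BOM so a reader can
# tell which byte of each pair comes first.
with_bom = text.encode("utf-16")
has_bom = with_bom[:2] in UTF16_BOMS

# iso-8859-1 uses one byte per character and has no BOM at all.
latin1 = text.encode("iso-8859-1")
latin1_has_bom = latin1[:2] in UTF16_BOMS

print(len(with_bom), has_bom)        # 2 BOM bytes + 4 characters * 2 bytes
print(len(latin1), latin1_has_bom)   # 4 characters * 1 byte
```

A parser that has been told to expect utf-16 looks for one of those two BOM
values at the start of the input, and an iso-8859-1 file simply begins with
ordinary text instead.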

Most likely the cause of this is that your XML prolog contains an
encoding="utf-16" declaration (or you've somehow told the XML parser
externally that the document is utf-16), when the document is actually
iso-8859-1 or windows-1252 encoded.
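
A sketch of that situation, again assuming Python's xml.etree.ElementTree
purely for illustration: the prolog claims utf-16 while the bytes are
iso-8859-1. The exact wording of the complaint depends on the parser, so this
one will not necessarily say "Missing byte-order mark", but the cause is the
same mismatch.

```python
import xml.etree.ElementTree as ET

# Prolog claims utf-16, but the bytes are written out as iso-8859-1.
source = '<?xml version="1.0" encoding="utf-16"?><word>sm\u00f8r</word>'
mismatched = source.encode("iso-8859-1")

try:
    ET.fromstring(mismatched)
    outcome = "parsed"
except ET.ParseError as err:
    outcome = "rejected"
    print("parser complained:", err)

# Making the prolog truthful is the fix; the bytes are left untouched.
fixed = mismatched.replace(b'encoding="utf-16"', b'encoding="iso-8859-1"')
fixed_text = ET.fromstring(fixed).text
print(outcome, repr(fixed_text))
```

The point of the second half is that nothing about the document's content had
to change, only the claim the prolog makes about it.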

-Mike

PS- It's "voilà" -- http://www.bartleby.com/61/81/V0138100.html

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

