RE: encoding woes: ISO-8859-1 vs. UTF-8
> I am confused with the recent behavior described
> following regarding encoding. I have a string "oLogo"
> in CSV, with those two weird characters actually being
> “ and ”, characters in General Punctuation
> II.
> Here is the steps I am going through, consistently
> using ISO-8859-1 for encoding:

You can't be using ISO-8859-1 to encode the characters “ and ”. ISO-8859-1 can only encode the characters in the range 0-255. Perhaps you were using some proprietary Microsoft 8-bit encoding that includes these two characters?

Rather than showing us what the CSV file looks like on your screen (which depends entirely on the software used to display it), it might help to show us what it looks like in hex.

> A. Import CSV
> 1. convert CSV to generic XML: the string did not
> change, stayed "oLogo".
> 2. saxon convert generic XML to proprietary XML:
> string got converted to "“Log”";
> 3. import successful

This looks as if everything is OK so far, although the original CSV file can't have been in iso-8859-1 as you claim.

> B. Export into CSV
> 1. pull from MSSQL7 to proprietary XML: "oLogo"
> 2. saxon convert proprietary XML to CSV: exception
> org.xml.sax.SAXException: Output character not
> available in this encoding (decimal 8220)

> Why going one way it works and not the other?

When you use <xsl:output method="text" encoding="iso-8859-1"/> you can only output the characters available in iso-8859-1, namely the XML characters in the range 0-255.

Michael Kay
Software AG
home: Michael.H.Kay@xxxxxxxxxxxx
work: Michael.Kay@xxxxxxxxxxxxxx

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
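
To make the range limit concrete, here is a minimal Python sketch that tries to encode the two quote characters (U+201C and U+201D, decimal 8220 and 8221) in three encodings and prints the result in hex. It is an illustration only: the sample string and the helper name try_encode are hypothetical, and cp1252 is used as one example of a Microsoft 8-bit encoding that includes these characters, not necessarily the one that produced the CSV file in question.

```python
# Illustrative sketch: which encodings can represent the curly quotes?
LEFT_QUOTE = "\u201c"   # decimal 8220, the character named in the SAXException
RIGHT_QUOTE = "\u201d"  # decimal 8221


def try_encode(text, encoding):
    """Return the encoded bytes as a hex string, or None if the
    encoding has no representation for some character in text."""
    try:
        return text.encode(encoding).hex()
    except UnicodeEncodeError:
        return None


# Hypothetical sample string; the real data is not shown in the thread.
sample = LEFT_QUOTE + "Logo" + RIGHT_QUOTE

results = {
    enc: try_encode(sample, enc)
    for enc in ("iso-8859-1", "cp1252", "utf-8")
}

for enc, hexed in results.items():
    print(enc, hexed if hexed is not None else "cannot encode")
```

ISO-8859-1 fails because both code points are above 255, which is the same condition the serializer reports on export. cp1252 stores each quote as a single byte (0x93 and 0x94), and UTF-8 stores each as three bytes.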