
Subject: Re: Special entity characters in Shift-JIS XSL.
From: "Nikolai Grigoriev" <grig@xxxxxxx>
Date: Fri, 17 Dec 1999 04:39:41 +0300
David Carlisle wrote:


>which spec? there is nothing that could be put into the xsl spec, as
>what you are asking for is a change in XML 1.0, this is why your
>suggested markup of using &# syntax will always be fragile and flaky.
>As soon as your documents are touched by any xml parser the characters
>may (or may not) be written out as character data in the document
>encoding rather than as character references, since the xml spec makes
>it explicit that these are equivalent when used in element character
>data.


I have much the same problems with Russian texts as Sean O'Dell
has with Shift-JIS. For Russian, there are two major 8-bit encoding
schemes plus two minor ones; UTF-8 is scarcely used because it
doubles the length of the text. Sure enough, none of the 8-bit Russian
encodings is supported by currently available XSLT processors. Well,
I can change the encoding declaration to ISO-8859-1 and get the whole
text through the parser without errors. But outputting the processing
results as UTF-8 is a disaster: what I get is "KOI8-R converted to
UTF-8 as if it were Latin-1", which is too much for poor me.
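
The damage described above can be reproduced outside any XSLT processor. The following is a minimal sketch in Python 3, not a model of any particular parser: the function name `mislabel_as_latin1` and the sample word are made up for illustration, and it assumes the source bytes are really KOI8-R while the encoding declaration claims ISO-8859-1.

```python
def mislabel_as_latin1(text: str, real_encoding: str = "koi8_r") -> bytes:
    """Simulate a document stored in real_encoding but declared as ISO-8859-1,
    then serialized as UTF-8 (hypothetical helper, for illustration only)."""
    raw = text.encode(real_encoding)   # bytes as they sit in the file
    misread = raw.decode("latin-1")    # parser trusts the wrong declaration
    return misread.encode("utf-8")     # processor writes its output as UTF-8


original = "Привет"
damaged = mislabel_as_latin1(original)

# Each KOI8-R letter became an accented Latin-1 character in UTF-8,
# so the output no longer reads as Russian.
print(damaged.decode("utf-8"))

# No information is lost, so the steps can be undone in reverse order.
recovered = damaged.decode("utf-8").encode("latin-1").decode("koi8_r")
print(recovered == original)
```

Since every byte value maps to some Latin-1 character, the misreading never raises an error, which is why the problem only shows up in the output.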

I admire James Clark's XT, but I can hardly use it for Russian - because
there's no way to make it output anything but UTF-8. Fortunately,
there is SAXON that supports Latin-1 in the output, and lets me pass
my weird letters through ;-); thanks to Mike Kay!

I think a universal solution would be proper support for US-ASCII
as an output encoding. This would write every character outside the
7-bit range as a numeric character reference - exactly what Sean
needs. This is often the preferred solution for non-Latin-1 encodings
that are unlikely to be supported by common tools in the near future.
It's a pity that the XML spec does not make this a conformance
requirement.
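
As a rough illustration of what such US-ASCII output looks like, here is a minimal Python 3 sketch. It is not how any XSLT processor implements it; the helper name `to_ascii_xml` is hypothetical. It relies on the standard `xmlcharrefreplace` error handler, and escapes markup characters first with `xml.sax.saxutils.escape`, since the error handler only touches non-ASCII characters.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def to_ascii_xml(text: str) -> bytes:
    """Serialize character data as 7-bit ASCII, writing everything else
    as decimal numeric character references (hypothetical helper)."""
    # escape() handles &, < and >; the error handler handles non-ASCII.
    return escape(text).encode("ascii", "xmlcharrefreplace")


encoded = to_ascii_xml("Привет")
print(encoded)  # b'&#1055;&#1088;&#1080;&#1074;&#1077;&#1090;'

# An XML parser turns the references back into the same characters,
# so the ASCII form carries the full text regardless of local encodings.
element = ET.fromstring(b"<p>" + encoded + b"</p>")
print(element.text == "Привет")
```

The price is size: each Cyrillic letter takes seven bytes here, so this suits interchange better than storage.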

SAXON kinda does it: it issues a message that US-ASCII encoding
is not supported and threatens to switch to UTF-8, but still prints all
non-ASCII characters as numeric character references. Thanks again, Mike ;-).

Regards,
Nikolai



 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

