
Re: Problems with characters

Subject: Re: Problems with characters
From: Tony Graham <Tony.Graham@xxxxxxx>
Date: Wed, 20 Feb 2002 14:13:20 +0000
Ragulf Pickaxe wrote at 20 Feb 2002 07:56:04 +0000:
 > The left character now depicts what I intended it to show but in the 
 > homepage it shows as:
 > [ &#281;  ] (&#x119; )  LATIN SMALL LETTER E WITH OGONEK
 > >>ø instead of ¸
 > The left character also suddenly showed the correct character in this mail, 
 > but the homepage showed this as:
 > (It does not show what I see but I see the character like u with same OGONEK 
 > as the former character or perhaps as a Greek character (the one used in 
 > measuring meaning 1e-6)).
 > >>å is depicted as å
 > This one is the only one showing the correct result all over (also on the 
 > HTML page).

You are mixing ISO 8859-1 (Latin1) and ISO 8859-13 (Latin7,
a.k.a. Baltic Rim).

See the code pages at http://www.czyborra.com/charsets/iso8859.html

You want to use:

&#xE5; LATIN SMALL LETTER A WITH RING ABOVE
&#xE6; LATIN SMALL LETTER AE
&#xF8; LATIN SMALL LETTER O WITH STROKE
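
As a quick check on those three references, here is a minimal sketch in Python 3 (standard library only). The helper name `to_char_ref` is made up for this illustration. It builds the hexadecimal numeric character reference for each character and confirms that it resolves back to the same character. Because the references are plain ASCII, they survive any of the ISO 8859 encodings unchanged.

```python
import html
import unicodedata


def to_char_ref(ch: str) -> str:
    """Return the hexadecimal numeric character reference for one character.

    Hypothetical helper for this illustration only.
    """
    return "&#x{:X};".format(ord(ch))


# The three characters of interest, written as escapes so that this
# source file is itself pure ASCII.
for ch in "\u00e5\u00e6\u00f8":
    ref = to_char_ref(ch)
    # html.unescape() resolves numeric character references.
    assert html.unescape(ref) == ch
    print(ref, unicodedata.name(ch))
```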

Quoting from iso8859-13.txt from czyborra.com:

------------------------------------------------------------
=B8	U+00F8	LATIN SMALL LETTER O WITH STROKE
...
=BF	U+00E6	LATIN SMALL LETTER AE
...
=E5	U+00E5	LATIN SMALL LETTER A WITH RING ABOVE
=E6	U+0119	LATIN SMALL LETTER E WITH OGONEK
...
=F8	U+0173	LATIN SMALL LETTER U WITH OGONEK
------------------------------------------------------------

When you say that you see LATIN SMALL LETTER E WITH OGONEK and LATIN
SMALL LETTER U WITH OGONEK, you are (i.e. your software is)
interpreting your text as being in ISO 8859-13.  You are seeing the
ISO 8859-13 characters that are at the same code points as the
characters of interest are in ISO 8859-1.
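
The effect described above can be reproduced with a small Python 3 sketch. It assumes Python's built-in `iso8859_1` and `iso8859_13` codecs follow the same tables as the czyborra.com listing. The function name `reinterpret` is invented for this example.

```python
def reinterpret(text: str, written_as: str, read_as: str) -> str:
    """Encode text in one charset, then decode the same bytes in another.

    Hypothetical helper: it mimics software that reads bytes with the
    wrong encoding label.
    """
    return text.encode(written_as).decode(read_as)


# Characters stored as ISO 8859-1 bytes but displayed as ISO 8859-13.
for ch in "\u00e5\u00e6\u00f8":
    seen = reinterpret(ch, "iso8859_1", "iso8859_13")
    print("U+{:04X} is displayed as U+{:04X}".format(ord(ch), ord(seen)))
```

Only U+00E5 comes through unchanged, because it sits at the same position (0xE5) in both code pages.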

When you send mail and most of the rest of the world sees &#xB8;,
CEDILLA, and &#xBF;, INVERTED QUESTION MARK, we are seeing the
characters of interest as encoded in ISO 8859-13 but we're
interpreting them as the ISO 8859-1 characters at those ISO 8859-13
code points.  Some aspect of how you composed your mail managed to map
the characters of interest to their ISO 8859-13 positions but, as
someone already noted, your email didn't indicate its encoding, so our
mail agents interpreted your email as ISO 8859-1 text.
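
The opposite direction can be sketched the same way. This is an illustration under the assumption that Python's standard codecs match the tables quoted above; `misread_as_latin1` is a made-up name. It encodes the characters at their ISO 8859-13 positions and then reads the bytes as ISO 8859-1, which is what a mail agent does when no encoding is declared and it falls back to Latin1.

```python
import unicodedata


def misread_as_latin1(text: str) -> str:
    """Encode as ISO 8859-13, then decode the same bytes as ISO 8859-1.

    Hypothetical helper for this illustration only.
    """
    raw = text.encode("iso8859_13")
    return raw.decode("iso8859_1")


for ch in "\u00f8\u00e6":
    raw = ch.encode("iso8859_13")
    seen = misread_as_latin1(ch)
    # Show the byte on the wire and the character the reader ends up with.
    print("0x{:02X} read as Latin1: {}".format(raw[0], unicodedata.name(seen)))
```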

This solves the mystery but doesn't solve your problem.

Is it possible that the intermittent correct display is because your
browser is trying to autodetect the encoding and failing?  Did the
comment that you deleted contain non-ASCII characters that may have
confused autodetection?  You may get consistent ISO 8859-1 results if
you manually select the character set/encoding in your browser.

If you are generating HTML, you can include a META element that
indicates the character set, which may help solve the problem.
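
The important point with the META element is that the declared character set and the bytes actually written must agree. A minimal Python 3 sketch of that idea follows. The function `render_page` and the page skeleton are assumptions made for this example, not part of any stylesheet in this thread.

```python
def render_page(body: str, encoding: str = "ISO-8859-1") -> bytes:
    """Build a small HTML page whose META element names the same
    encoding that is used to turn the page into bytes.

    Hypothetical helper for this illustration only.
    """
    page = (
        "<html><head>\n"
        '<meta http-equiv="Content-Type" '
        'content="text/html; charset={}">\n'
        "</head><body>\n"
        "{}\n"
        "</body></html>\n"
    ).format(encoding, body)
    # Using one variable for both the label and the encode() call keeps
    # the declaration and the actual bytes from drifting apart.
    return page.encode(encoding)


data = render_page("\u00e5 \u00e6 \u00f8")
print(data.decode("iso8859_1"))
```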

You could also specify UTF-8 as the encoding for the output of your
two stylesheets, but that may cause more problems if the rest of your
software can't really handle UTF-8.
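
To see the kind of trouble UTF-8 output can cause downstream, here is a short Python 3 sketch (the name `utf8_seen_as_latin1` is invented for this example). In UTF-8 each of the three characters becomes two bytes, and software that assumes ISO 8859-1 will display those two bytes as two separate characters.

```python
def utf8_seen_as_latin1(text: str) -> str:
    """Encode as UTF-8, then decode the same bytes as ISO 8859-1.

    Hypothetical helper: it mimics a tool that cannot handle UTF-8.
    """
    return text.encode("utf-8").decode("iso8859_1")


for ch in "\u00e5\u00e6\u00f8":
    raw = ch.encode("utf-8")
    seen = utf8_seen_as_latin1(ch)
    # One character in, two bytes on disk, two characters on screen.
    print("U+{:04X}: {} bytes, displayed as {} characters".format(
        ord(ch), len(raw), len(seen)))
```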

Regards,


Tony Graham
------------------------------------------------------------------------
XML Technology Center - Dublin                mailto:tony.graham@xxxxxxx
Sun Microsystems Ireland Ltd                       Phone: +353 1 8199708
Hamilton House, East Point Business Park, Dublin 3            x(70)19708

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

