Re: Recognized Unicode characters?

Subject: Re: Recognized Unicode characters?
From: Geert Josten <Geert.Josten@xxxxxxxxxxx>
Date: Mon, 09 May 2005 15:48:46 +0200
Hi,

Maybe your browser's default font doesn't support the character you are trying to see. I cannot reproduce the problem. With output method HTML, the XSL processor (Xalan) converts the 8212 to &mdash; when writing us-ascii, and to a UTF-8 byte sequence when writing utf-8. I see either garbage (when the output is utf-8 and there is no meta tag specifying the encoding) or just the character you are looking for. I did not see the square box in any of the cases I tested...
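To make the two cases above concrete, here is a minimal Python sketch. Python is used purely for illustration; it does not involve Xalan, and the helper names are made up for this example. It shows what code point 8212 (U+2014, the em dash) looks like as UTF-8 bytes, as US-ASCII with a character reference, and what kind of garbage appears if the UTF-8 bytes are read as Windows-1252. That a viewer falls back to Windows-1252 when no encoding is declared is an assumption, not something tested here.

```python
# Code point 8212 is U+2014 (em dash), the character discussed above.
EM_DASH = chr(8212)


def as_utf8(text: str) -> bytes:
    """Serialize as UTF-8: the em dash becomes a three-byte sequence."""
    return text.encode("utf-8")


def as_ascii_with_refs(text: str) -> bytes:
    """Serialize as US-ASCII, replacing unrepresentable characters with
    numeric character references. This is similar in spirit to an XSLT
    serializer; Xalan's HTML output writes the named entity instead."""
    return text.encode("ascii", errors="xmlcharrefreplace")


def misread_as_cp1252(data: bytes) -> str:
    """Decode UTF-8 bytes as Windows-1252, as a viewer might do when no
    encoding is declared (an assumption made for illustration)."""
    return data.decode("cp1252")


if __name__ == "__main__":
    print(as_utf8(EM_DASH))             # b'\xe2\x80\x94'
    print(as_ascii_with_refs(EM_DASH))  # b'&#8212;'
    # Three unrelated characters instead of one dash: the "garbage" case.
    print(ascii(misread_as_cp1252(as_utf8(EM_DASH))))
```

The ASCII form survives any viewer because it contains only 7-bit characters, while the UTF-8 form depends on the reader knowing the encoding.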

Cheers,
Geert

Thanks for responding, but I think you guys lost me.
Here is the xslt header info I used:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"/>


I set the output method to HTML because that is the output I am creating. (Isn't this right?)

As for the encoding, I have to admit I am confused. I picked UTF-8 mostly because of the general recommendations for its use in learning-XML books and websites (and because it is the default), but none that I have seen explains why in any detail, or why anyone might use something different. The special characters in my source XML file are all character references to the Unicode numbers (&#___;, etc.)

As I understand it, shouldn't the XSLT processor know from the "encoding" attribute that the references will be to Unicode numbers, and read them correctly as those characters? I also understand that the processor has some flexibility in how it outputs the text, but that it will often output special characters as entity references (e.g., the "&" symbol as "&amp;").
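On the question above: numeric character references such as &#8212; always name Unicode code points, whatever the encoding declaration says. The declaration only tells the parser how to decode the bytes of the file. A small sketch of this, using Python's standard-library XML parser rather than an XSLT processor (an assumption made purely for illustration; the sample documents are hypothetical):

```python
import xml.etree.ElementTree as ET


def text_of(xml_bytes: bytes) -> str:
    """Parse a tiny XML document and return the root element's text."""
    return ET.fromstring(xml_bytes).text


# The same character reference under two different declared encodings.
DOC_UTF8 = b'<?xml version="1.0" encoding="UTF-8"?><p>&#8212;</p>'
DOC_LATIN1 = b'<?xml version="1.0" encoding="ISO-8859-1"?><p>&#8212;</p>'

if __name__ == "__main__":
    for doc in (DOC_UTF8, DOC_LATIN1):
        # ascii() shows the code point without depending on the console font.
        print(ascii(text_of(doc)))  # '\u2014' both times
```

Both documents yield the same character, so the reference is read correctly on input; any square box would have to come from how the output is encoded or displayed.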

So, I am still confused about why a Unicode reference to #8212 won't output correctly. The output displays a square box both in the browser (IE6) and in the HTML source itself (viewed in Windows Notepad).

> > > Shouldn't that be <xsl:output encoding="US-ASCII"... for safety?
> >
> > Neither is completely safe of course,
>

The spec only requires support for UTF-8 and UTF-16; anything else is
optional.

I personally use "iso-646" as the name of this encoding. The differences are
immaterial (different names for some of the characters, I believe) but I
prefer international standards as a matter of principle.


Michael Kay
http://www.saxonica.com/




--
=====================================
NB: the Daidalos office has been located at a new address since 22 April:

Daidalos BV
Hoekeindsehof 1 - 4
2665 JZ Bleiswijk
tel: +31 (0)10 850 12 00
fax: +31 (0)10 850 11 99

The above address is also the postal address.
======================
Geert.Josten@xxxxxxxxxxx
IT-consultant at Daidalos BV

http://www.daidalos.nl/

GPG: 1024D/12DEBB50
