Re: Newbie encoding query

Subject: Re: Newbie encoding query
From: Mike Brown <mike@xxxxxxxx>
Date: Wed, 4 Dec 2002 00:24:49 -0700 (MST)
Satish, L. Gnanendra wrote:
> The UserManual.XSL has a parameters which has to have a trademark
> symbol(#153):

*Some* HTML user agents tolerate the invalid use of &#153; to refer to
byte 153 of the windows-1252 encoding, but this is wrong for two reasons:

1. The number in a numeric character reference in XML or HTML is, by
definition, a character's Unicode code point. Unicode code point 153 (U+0099)
corresponds to a legacy control character: SINGLE GRAPHIC CHARACTER INTRODUCER
(SGCI), which is not what you want.

2. Although not enforced, HTML's SGML declaration disallows Unicode characters
in the range 127-159, in addition to those that are disallowed by XML. You
cannot have them in a conforming HTML document, not even by reference.

&#8482; is the trademark symbol. You must use that in your XML and XSLT.
Do not use &#153;.
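
To see the difference between the two references, here is a minimal sketch in
Python 3 (an assumption on my part; the original question involves only XSLT).
It uses the standard library's ElementTree parser and unicodedata module, and
the variable names are purely illustrative.

```python
import unicodedata
import xml.etree.ElementTree as ET

# An XML parser resolves &#153; to Unicode code point 153 (U+0099),
# a C1 control character, not to the trademark sign.
resolved = ET.fromstring("<p>&#153;</p>").text
print(ord(resolved), unicodedata.category(resolved))  # 153 Cc

# Byte 153 (0x99) means "trademark" only in windows-1252.
tm = bytes([153]).decode("cp1252")
print(ord(tm), unicodedata.name(tm))  # 8482 TRADE MARK SIGN

# So the reference that names the trademark sign is &#8482;.
correct = ET.fromstring("<p>&#8482;</p>").text
print(correct == tm)  # True
```

The category "Cc" marks a control character, which is why nothing useful is
displayed for it in a conforming user agent.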

> <xsl:output method="html" encoding="UTF-8"/>

> My problem is that, when it is viewed in a IE6 browser, the parameter "GUI"
> displays:
> User Interface of PrismaÂ(tm) which is not it should be. I want to eliminate
> "Â" char from the html view. how do i go about this?

You asked for UTF-8 output. If you represent the Unicode character #153 in 
UTF-8, you get 2 bytes: <C2 99>. If you then view this output in an 
environment that does not recognize UTF-8, those bytes will be displayed as
characters from some other encoding. In your case, they are being mistakenly 
assumed to be windows-1252 bytes.

Really, you wanted Unicode character #8482, which in UTF-8 is 3 bytes:
<E2 84 A2>. That is going to look like an even uglier mess until you
correct the other problem: your web browser does not know that the
HTML is UTF-8 encoded.
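
The byte arithmetic can be checked with a short Python 3 sketch (Python is an
assumption here, used only for illustration). It encodes both code points as
UTF-8 and then decodes the bytes as windows-1252, which is what a browser
does when it guesses the wrong encoding.

```python
def hex_bytes(data):
    """Format bytes as space-separated uppercase hex, e.g. 'C2 99'."""
    return " ".join(f"{b:02X}" for b in data)

# Code point 153, which is what &#153; really refers to.
wrong = chr(153).encode("utf-8")
print(hex_bytes(wrong))  # C2 99

# Code point 8482, TRADE MARK SIGN.
right = chr(8482).encode("utf-8")
print(hex_bytes(right))  # E2 84 A2

# Misreading the UTF-8 bytes as windows-1252 produces the stray characters.
wrong_seen = wrong.decode("cp1252")  # 'Â' followed by the trademark sign
right_seen = right.decode("cp1252")  # three unrelated characters

# Reading them as UTF-8 gives back the single intended character.
right_ok = right.decode("utf-8")
print(len(wrong_seen), len(right_seen), len(right_ok))  # 2 3 1
```

The first case only looks "almost right" by coincidence: byte 99 happens to
be the trademark sign in windows-1252, so the sole visible damage is the
leading "Â" from byte C2.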

Your XSLT processor should have added <meta http-equiv="Content-Type"
content="text/html;charset=UTF-8"> to the <head> of your HTML output. This
meta tag will tell your browser that the document's bytes are UTF-8 encoded
characters.

I suspect that your XSLT processor did not do this because you did not put a
<head> in your document, which is an HTML error anyway. Fix that. All HTML 
documents require a head, title and body:

<html>
  <head>
    <title>...</title>
  </head>
  <body>
    ...
  </body>
</html>
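
One way to confirm that the generated output has a <head> and declares its
encoding is a small check like the following. This is only a sketch in
Python 3 using the standard html.parser module; the class CharsetCheck and
the function check are hypothetical names, not part of any XSLT processor.

```python
from html.parser import HTMLParser


class CharsetCheck(HTMLParser):
    """Hypothetical helper: notes <head> and any meta-declared charset."""

    def __init__(self):
        super().__init__()
        self.has_head = False
        self.charset = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # the parser lowercases tag and attribute names
        if tag == "head":
            self.has_head = True
        elif tag == "meta":
            if (attrs.get("http-equiv") or "").lower() == "content-type":
                content = (attrs.get("content") or "").lower()
                # Take whatever follows "charset=" in the content value.
                _, sep, value = content.partition("charset=")
                if sep:
                    self.charset = value.strip() or None


def check(html_text):
    """Return (has_head, declared_charset) for an HTML string."""
    parser = CharsetCheck()
    parser.feed(html_text)
    parser.close()
    return parser.has_head, parser.charset


good = ('<html><head><meta http-equiv="Content-Type" '
        'content="text/html;charset=UTF-8"><title>t</title></head>'
        '<body>x</body></html>')
bad = "<html><body>x</body></html>"

print(check(good))  # (True, 'utf-8')
print(check(bad))   # (False, None)
```

If the second result is what you get for your transformed document, the
browser has nothing in the markup telling it the bytes are UTF-8.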


Mike

-- 
  Mike J. Brown   |  http://skew.org/~mike/resume/
  Denver, CO, USA |  http://skew.org/xml/

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

