
Re: suppression of the transformation of character entities in html ansi
From: "S Woodside" <sbwoodside@y...>

> Probably you are specifying the output to be encoded in UTF-8 or 
> something like that where the character is supported in the encoding. 

I don't think so.  The data goes wrong coming into the XML processor. The character 
references are supposed to be for various kinds of quotes, but the numbers are not the 
Unicode numbers. If the characters get through, it will only be by accident.  If the output 
encoding is set to UTF-8, for example, then &#146; will produce two bytes (0xC2 0x92, the 
C1 control character U+0092), not a quote character.  
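To make the two-byte claim concrete, here is a small Python sketch. It assumes Python's standard xml.etree.ElementTree parser stands in for "the XML processor"; the element name q is just a placeholder.

```python
import xml.etree.ElementTree as ET

# An XML parser resolves &#146; to code point 146 (U+0092), a C1 control.
text = ET.fromstring("<q>&#146;</q>").text
utf8 = text.encode("utf-8")
print(f"U+{ord(text):04X}", utf8.hex(" "))

# The intended right single quote is U+2019 (decimal 8217).
quote = ET.fromstring("<q>&#8217;</q>").text
print(f"U+{ord(quote):04X}", quote.encode("utf-8").hex(" "))
```

The first reference serializes as the two bytes c2 92; the correct one serializes as the three bytes e2 80 99.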

(The case where it will *seem to* work is if the output encoding passes the C1 
characters through to the same bytes: for example, an ISO 8859-1 transcoder. If the output 
is then read as CP1252, the quote characters will come out.) 
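The accidental round trip can be sketched in Python, using its built-in iso-8859-1 and cp1252 codecs as stand-ins for the transcoder and the reading application:

```python
# U+0092 is what the XML processor holds after resolving &#146;.
c1_char = chr(146)

# ISO 8859-1 maps code points 0-255 straight to the same byte values.
raw = c1_char.encode("iso-8859-1")
print(raw.hex())

# A reader that assumes CP1252 interprets byte 0x92 as U+2019.
seen = raw.decode("cp1252")
print(f"U+{ord(seen):04X}")
```

The quote only appears because two unrelated assumptions happen to line up; change either encoding and it breaks.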

Bad systems are easy: fragile, slack, and out of control. It is better to make the 
character references be for the correct Unicode characters, so that the XML coming
in is correct. Then make sure the XML coming out is correct.  

Also avoid debugging the character encoding of generated HTML with a browser: browsers 
can guess or do all sorts of things (depending on the generation, brand and settings). Use 
any hex or text editor that lets you select encodings or that understands the XML 
encoding header. Using a browser to figure out what is happening with encodings is
the surest road to insanity. 
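If no hex editor is at hand, a few lines of Python will show the actual bytes. This is only a sketch; hex_dump is a made-up helper name, and in practice the bytes would come from the generated file opened in binary mode.

```python
def hex_dump(data: bytes) -> str:
    """Return the bytes as space-separated hex pairs."""
    return data.hex(" ")

# The same text, "It" + right single quote + "s", in two encodings.
as_utf8 = "It\u2019s".encode("utf-8")
as_cp1252 = "It\u2019s".encode("cp1252")

print(hex_dump(as_utf8))
print(hex_dump(as_cp1252))
```

The quote shows up as e2 80 99 in the UTF-8 output and as the single byte 92 in the CP1252 output, with no guessing involved.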

To see what the character references should be, see
    http://www.alanwood.net/demos/ansi.html
Instead of the numbers in the "ANSI" column, use the (decimal) numbers in the
"Unicode" column. 
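The same lookup can be done mechanically. The following Python sketch assumes the "ANSI" numbers are Windows-1252 byte values; fix_ansi_refs and unicode_ref are hypothetical helper names, and only references in the range 128-159 are rewritten.

```python
import re

def unicode_ref(ansi_code: int) -> str:
    """Map a Windows-1252 byte value to a decimal Unicode character reference."""
    # Raises UnicodeDecodeError for the few bytes CP1252 leaves undefined.
    ch = bytes([ansi_code]).decode("cp1252")
    return f"&#{ord(ch)};"

def fix_ansi_refs(markup: str) -> str:
    """Rewrite decimal references 128-159 to their Unicode equivalents."""
    def repl(match: re.Match) -> str:
        code = int(match.group(1))
        return unicode_ref(code) if 128 <= code <= 159 else match.group(0)
    return re.sub(r"&#(\d+);", repl, markup)

print(fix_ansi_refs("&#147;It&#146;s&#148; &#233;"))
```

Here &#147;, &#146; and &#148; become &#8220;, &#8217; and &#8221;, while &#233; is left alone because it is already a valid Unicode number.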

Cheers
Rick Jelliffe

