
Re: UTF-8, RTF and XSLT

Subject: Re: UTF-8, RTF and XSLT
From: Russell Kohn <russ@xxxxxxxxxxxx>
Date: Fri, 8 Nov 2002 07:53:26 -0800
At 9:27 AM +0000 11/8/02, David Carlisle wrote:
RTF isn't XML, so you would be better off using text output than xml. Also, if
you don't want non-ASCII characters encoded in UTF-8, then specify a
different encoding, e.g. latin1, so...

<xsl:output method="text" encoding="iso-8859-1"/>

Hi David,


Yes. However, I think I may have been unclear before, so let me try again...


My source XML looks something like this (I'm simplifying here):


<?xml version="1.0" encoding="UTF-8" ?>
<resultset>
<data>theData</data>
<data>someMoredata</data>
<data>youGetTheIdea</data>
</resultset>

Now, let's assume theData contains the single &Aring; (&#197;) character, which should render as a capital A with a ring on top.

If I open the raw XML file in a browser or other application that can read UTF-8 natively, this renders fine. If I open it in a pure text editor, then theData appears as two arbitrary characters and not a single &Aring;.
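
As an illustration of the symptom described above (not part of the original files), the short Python sketch below shows why a single Å can appear as two characters: its UTF-8 form is two bytes, and a viewer that assumes a single-byte code page displays each byte separately. The choice of Windows-1252 as the misreading code page is an assumption; other single-byte encodings give a similar effect.

```python
# Å is U+00C5; compare its byte form in two encodings.
ch = "\u00c5"

utf8_bytes = ch.encode("utf-8")          # two bytes: 0xC3 0x85
latin1_bytes = ch.encode("iso-8859-1")   # one byte: 0xC5

# A viewer that assumes a single-byte code page (Windows-1252 here, as an
# assumption) decodes the two UTF-8 bytes as two unrelated characters.
misread = utf8_bytes.decode("cp1252")

print(utf8_bytes.hex(), latin1_bytes.hex(), len(misread))
```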

I do not have control over the raw XML file, so I can't change the way theData is encoded on the way in.

OK, now my XSLT file looks something like this (again simplifying):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:myStuff="myURL" exclude-result-prefixes="myStuff">
<xsl:output method="text" version="1.0" encoding="theEncoding" indent="yes" omit-xml-declaration="yes"/>


<xsl:template match="myStuff:resultset">
<xsl:for-each select="myStuff:data">
   <xsl:text>a bunch of stuff</xsl:text>
   <xsl:value-of select="."/>
   <xsl:text>some more stuff</xsl:text>
</xsl:for-each>
</xsl:template>

</xsl:stylesheet>


I have tried setting theEncoding to various static values such as "UTF-8", "iso-8859-1", "ISO-8859-1", other iso-8859-x variants, and some other values. My XSLT creates rich text files that need to be opened by various RTF readers. In every case I've tried, theData comes out UTF-8 encoded and is not coerced into a different encoding.


If I open my final output file in a plain text editor, I see the same arbitrary characters I had originally. If I open the output in a UTF-8 capable reader, the characters do render properly. Since most RTF readers are not going to read UTF-8, how can I get theData to convert from UTF-8 to something more digestible?

Since I'm creating Rich Text Format output, I would be happy to solve this problem within the rtf side of things; however, I'd much prefer to solve it by fixing my xslt stylesheet if possible.
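
For the RTF-side option mentioned above, one possible approach is to keep the file pure ASCII and write every non-ASCII character as an RTF escape. The Python sketch below is only an illustration of that idea, run outside the stylesheet; the function name `rtf_escape` is hypothetical, and it assumes the RTF header declares `\ansi\ansicpg1252` so that `\'hh` escapes are read as Windows-1252 bytes. Characters outside that code page fall back to the `\uN?` form, where `?` is the substitute shown by readers that do not understand `\u`.

```python
def rtf_escape(text: str) -> str:
    """Return text as ASCII-only RTF body text (hypothetical helper)."""
    pieces = []
    for ch in text:
        if ch in "\\{}":
            # Characters with special meaning in RTF must be backslash-escaped.
            pieces.append("\\" + ch)
        elif ord(ch) < 0x80:
            pieces.append(ch)
        else:
            try:
                # Assumes the RTF header says \ansi\ansicpg1252.
                byte = ch.encode("cp1252")[0]
                pieces.append("\\'%02x" % byte)
            except UnicodeEncodeError:
                # \uN takes a signed 16-bit decimal per UTF-16 code unit.
                data = ch.encode("utf-16-le")
                for i in range(0, len(data), 2):
                    unit = int.from_bytes(data[i:i + 2], "little")
                    if unit > 0x7FFF:
                        unit -= 0x10000
                    pieces.append("\\u%d?" % unit)
    return "".join(pieces)


print(rtf_escape("\u00c5"))  # the Å from the example data
```

With this scheme the Å from theData would be written as `\'c5`, which is plain ASCII in the file regardless of the output encoding chosen in the stylesheet.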

TIA,

- Russ



Russell Kohn , Chaparral Software & Consulting Services Inc.
Calabasas, California -  http://www.chapsoft.com - 818.225.1247

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list

