Subject: RE: Translating character entities for plain text output
From: "Michael Kay" <michael.h.kay@xxxxxxxxxxxx>
Date: Wed, 17 Jul 2002 19:14:19 +0100
> I've found a lot of discussion in the archives about solving 
> character entity problems for HTML output, but not much on plain text:
> 
> Generating plain text from docbook via XSLT, I need to output

What do you mean by "plain text"? Specifically, what character encoding?
If you live in the US or Western Europe, chances are you want
iso-8859-1, so specify encoding="iso-8859-1" in the xsl:output
declaration.
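
A minimal sketch of such a declaration, assuming the text output method is the one wanted here (the rest of the stylesheet is omitted):

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Serialize the result as plain text, encoded as iso-8859-1.
       This only takes effect when the XSLT processor itself does the
       serialization, that is, when it writes the file or stream. -->
  <xsl:output method="text" encoding="iso-8859-1"/>

</xsl:stylesheet>
```

Note that iso-8859-1 has no em dash (U+2014), so with the text method that character still has to be converted by one of the routes described below; the encoding declaration alone will not turn it into "--".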
 
> a space for &nbsp; and -- for &mdash;.  I can get some funny 
> glyphs (like Â for &nbsp;) and various literal codes, but not 
> the result I want. I could postprocess the output, but I'd 
> love to fix the style sheet.

Yes, you can do these conversions either in the stylesheet or by
postprocessing. Or if you want to be clever, you could do it at input
time: change the entity definitions so that &nbsp; means " " and &mdash;
means "--".
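
As a sketch of the input-time approach, the source document could carry its own declarations in the internal DTD subset. The document below is a made-up example, and it leaves out the external DocBook DTD reference to stay self-contained. In a real DocBook source the two ENTITY lines would go inside the existing DOCTYPE's square brackets, where they take precedence because the internal subset is read first and the first declaration of an entity wins.

```xml
<?xml version="1.0"?>
<!DOCTYPE article [
  <!-- Hypothetical overrides: plain-text stand-ins for the two entities -->
  <!ENTITY nbsp " ">
  <!ENTITY mdash "--">
]>
<article>
  <para>Wait&nbsp;here&mdash;please.</para>
</article>
```

After parsing, the stylesheet sees the para content as "Wait here--please." and needs no special handling for either character.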

XSLT outputs bytes, not glyphs. The Â glyph for &nbsp; was created by
the software you used to view the bytes. In this case the XSLT processor
was outputting a UTF-8 encoding of the character, and you were viewing
it using software that thought it was looking at iso-8859-1.
> 
> In the stylesheet, I've tried defining the entity in a local 
> subset

It's irrelevant how the stylesheet defines the entity: the XML parser
looks for the entity definitions in the source document.

> , also html and text methods and various encodings in 
> the xsl:output. The following almost works:
> 
> <xsl:template match="text()">
>      <xsl:if test="contains(.,'&#160;')">
>          <xsl:value-of select="translate(., '&#160;', ' ')"/>
>      </xsl:if>
> </xsl:template>
> 
> Unfortunately, this seems to suppress another essential 
> translation on the same context:
> 
>      <xsl:value-of select="translate(., '&#xA;&#xD;', ' ')"/> 
> 
> I can do either, but not both.

You can do both by writing translate(., '&#160;&#xA;&#xD;', '  ')
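
Put into a complete stylesheet, that could look like the sketch below. It drops the xsl:if from the quoted template, on the assumption that text nodes without a no-break space should still be output; the xsl:output line is likewise an assumption taken from the encoding advice earlier in this message.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text" encoding="iso-8859-1"/>

  <!-- No xsl:if wrapper: every text node is copied, so nodes that
       contain no no-break space are kept as well. -->
  <xsl:template match="text()">
    <!-- &#160; becomes a space, &#xA; becomes a space, and &#xD; is
         deleted, because the third character has no counterpart in the
         two-space replacement string. -->
    <xsl:value-of select="translate(., '&#160;&#xA;&#xD;', '  ')"/>
  </xsl:template>

</xsl:stylesheet>
```

The built-in template rules still walk the element tree, so this one template is applied to every text node in the source.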

But actually, there probably won't be any &#xD; characters in your
source: the XML parser normalizes line endings, so they never reach the
stylesheet. Your translation works by accident, I suspect, because it
converts an &#xA; to a space and an &#xD; to nothing.

Michael Kay
Software AG
home: Michael.H.Kay@xxxxxxxxxxxx
work: Michael.Kay@xxxxxxxxxxxxxx 


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

