Re: XSLT to remove characters and whitespaces

Subject: Re: XSLT to remove characters and whitespaces
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Fri, 07 Jul 2006 11:08:41 -0400
Hi Georg,

A couple of things:

I'm not sure why you normalize the spaces only after using translate() (in two separate operations) to convert CRs and tabs to spaces and to strip line feeds. Why not simply call normalize-space(), since it already takes care of line feeds and tabs? (The parser should already have normalized CRs away, so they shouldn't even be there.)
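To see that normalize-space() is enough on its own, here is a minimal sketch. The variable name and its sample value are made up for illustration, and the stylesheet is written as XSLT 1.0 so that any processor can run it against any input document:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- $raw is a hypothetical sample: a tab, a line feed and runs of spaces -->
    <xsl:variable name="raw" select="'&#9;  Hey,&#10;   how  '"/>
    <!-- brackets make leading and trailing whitespace visible in the output -->
    <xsl:text>[</xsl:text>
    <xsl:value-of select="normalize-space($raw)"/>
    <xsl:text>]</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

This writes [Hey, how]: the tab and line feed are treated as whitespace, runs are collapsed to a single space, and both ends are trimmed, with no translate() involved.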

More basically, and this is what accounts for your problem: you are matching elements, creating new elements with the same names (any reason not to use the simpler xsl:copy instruction?), writing out their string values (i.e. all the text inside the elements), and then descending the tree to do the same. As a result, your string values are written out over and over again, once for every ancestor element that gets processed.

So if your input were

  <text>Hey, how are tricks?</text>

you'll get

<greeting>GeorgXSL-ListHey, how are tricks?
  <text>Hey, how are tricks?</text>

since the greeting element gets its text value copied before its own element contents are traversed.
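The repetition can be reproduced with a stripped-down template. This is only a sketch of the pattern, not your exact stylesheet: the child element names from and to are assumed for illustration, and apply-templates is restricted to child elements to keep the output short.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical input (the names <from> and <to> are assumptions):
     <greeting><from>Georg</from><to>XSL-List</to><text>Hey, how are tricks?</text></greeting>
-->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="*">
    <xsl:element name="{name()}">
      <!-- "." is the element itself, so its string value is the
           concatenated text of ALL its descendants -->
      <xsl:value-of select="normalize-space(.)"/>
      <!-- each child element then repeats the same step for its own subtree -->
      <xsl:apply-templates select="*"/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

For that input, the greeting element comes out starting with GeorgXSL-ListHey, how are tricks? and is then followed by the from, to and text elements, each carrying its own text once more.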

Instead of this, you only want to normalize values of the *text* nodes, letting element nodes take care of themselves ... so:

<xsl:template match="text()">
  <xsl:value-of select="normalize-space()"/>
</xsl:template>

<xsl:template match="*">
  <xsl:copy>
    <xsl:copy-of select="@*"/>
    <xsl:apply-templates/>
  </xsl:copy>
</xsl:template>

... as you can see, fairly simple, and a garden-variety near-identity transform.
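Dropped into a complete stylesheet, the approach might look like the sketch below. The xsl:output and xsl:strip-space declarations are carried over from your stylesheet; version 1.0 and the omission of xsl:result-document are my assumptions, made so the example runs on any processor and writes to the normal result tree.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <!-- drop whitespace-only text nodes from the input -->
  <xsl:strip-space elements="*"/>

  <!-- text nodes: collapse runs of whitespace and trim both ends -->
  <xsl:template match="text()">
    <xsl:value-of select="normalize-space()"/>
  </xsl:template>

  <!-- elements: shallow copy, keep the attributes, then recurse into children -->
  <xsl:template match="*">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Two things to watch for: comments and processing instructions are dropped, because the built-in templates for them produce nothing, and in mixed content the trimming removes the spaces next to inline elements, so that case may need gentler handling.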


At 06:33 AM 7/7/2006, you wrote:

I have an XML file with some content in it which contains some unwanted
carriage returns and whitespace. Now I'm trying to write a stylesheet
which makes an exact copy of the source file but without the returns
and whitespace. I thought this should work:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output name="stripped" method="xml" version="1.0"
encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:result-document format="stripped" href="result.xml">
<xsl:template match="*">
<xsl:element name="{name()}">
<xsl:value-of select="normalize-space(translate(translate(.,
'&#x0d;&#x0a;', ' '), '&#09;', ' '))"/>

But the output is a mess in parts. What am I doing wrong?
