
Selective escaping of special characters

Subject: Selective escaping of special characters
From: "Kyrre Wathne" <kyrre@xxxxxxxx>
Date: Tue, 12 Mar 2002 14:21:53 +0100
My apologies if this question has been asked before; I haven't found any
posts that address this exact issue.

My problem is that I want to transform junk HTML generated by Microsoft
Word. This contains markup, of course, so my first instinct was to use
disable-output-escaping. However, this also disables escaping of other
special characters, such as the en dash &#8211;. These are then output in
a form my browser (Internet Explorer) doesn't understand (I use
"ISO-8859-1" as the output encoding).
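
To make the setup concrete, here is a minimal sketch of the kind of
stylesheet described above. The element name "fragment", the html output
method and the sample input are assumptions for illustration only; the
actual document structure may differ.

```xml
<?xml version="1.0"?>
<!-- Sketch only. Assumes the Word HTML is stored as escaped text inside a
     hypothetical <fragment> element, for example:
     <fragment>&lt;b&gt;1990&#8211;2002&lt;/b&gt;</fragment> -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" encoding="ISO-8859-1"/>

  <xsl:template match="fragment">
    <!-- Escaping is switched off for the whole string, so it is switched
         off for the en dash as well as for the markup characters -->
    <xsl:value-of select="." disable-output-escaping="yes"/>
  </xsl:template>

</xsl:stylesheet>
```

The markup characters come through as intended, but so does every other
character in the string, which is the part I want to avoid.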

I did work out a fix (pasted below) using a recursive named template, but
this is proving too slow for all but the smallest documents. (I use Saxon
6.5.1.)

My question, then: is there a fast way to disable escaping only for "<",
">" and "&"? Alternatively, can the named template below be optimized
significantly?

Thanks for any help.

Kyrre Wathne



<!-- Named template to output markup while escaping special characters -->

<xsl:template name="DUMP_TAG_STRING">
  <xsl:param name="str"/>
  <xsl:choose>
    <xsl:when test="not($str)">
      <!-- Empty string: nothing to output -->
    </xsl:when>
    <xsl:when test="not(contains($str, '&lt;')) and not(contains($str, '&gt;'))
                    and not(contains($str, '&amp;'))">
      <!-- No markup characters left: output the rest with normal escaping -->
      <xsl:value-of select="$str"/>
    </xsl:when>
    <xsl:otherwise>
      <!-- Map all three markup characters to one marker, U+2408 (SYMBOL FOR
           BACKSPACE), so a single substring-before() finds the first of them.
           Assumes the input never contains U+2408 itself. -->
      <xsl:variable name="escaped"
                    select="translate($str, '&lt;&gt;&amp;', '&#9224;&#9224;&#9224;')"/>
      <!-- Position of the first markup character in $str -->
      <xsl:variable name="cutPos"
                    select="1 + string-length(substring-before($escaped, '&#9224;'))"/>
      <!-- Text before the first markup character -->
      <xsl:variable name="before" select="substring($str, 1, $cutPos - 1)"/>
      <!-- The markup character itself -->
      <xsl:variable name="replace" select="substring($str, $cutPos, 1)"/>
      <!-- Everything after the markup character -->
      <xsl:variable name="after" select="substring($str, $cutPos + 1)"/>
      <!-- Output the leading text with normal escaping -->
      <xsl:value-of select="$before"/>
      <!-- Output &lt;, &gt; or &amp; as is, unescaped -->
      <xsl:value-of select="$replace" disable-output-escaping="yes"/>
      <xsl:if test="$after">
        <!-- Recurse on the remainder -->
        <xsl:call-template name="DUMP_TAG_STRING">
          <xsl:with-param name="str" select="$after"/>
        </xsl:call-template>
      </xsl:if>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
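
For completeness, a sketch of how the template might be called. This is a
fragment meant to sit in the same stylesheet as DUMP_TAG_STRING; the element
name "fragment" is a hypothetical stand-in for whatever element holds the
Word HTML as text.

```xml
<!-- Hypothetical caller: "fragment" is an assumed element name -->
<xsl:template match="fragment">
  <xsl:call-template name="DUMP_TAG_STRING">
    <!-- Pass the string value of the element, markup characters included -->
    <xsl:with-param name="str" select="string(.)"/>
  </xsl:call-template>
</xsl:template>
```

Each call handles one markup character and then recurses on the remainder,
so the recursion depth and the amount of string copying grow with the number
of markup characters in the input.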


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

