Re: Negative lookahead/behind in XSL regexp

Subject: Re: Negative lookahead/behind in XSL regexp
From: Wolfgang Laun <wolfgang.laun@xxxxxxxxx>
Date: Tue, 10 May 2011 08:56:01 +0200
Here is a clumsy attempt at recognizing "real" keywords. Improvements
are certainly possible, e.g., putting it all into a single function.
The basic idea is to split the text at every match of the elementary
keyword pattern, and then to test the end of the preceding chunk
against the lookbehind pattern and the start of the following chunk
against the lookahead pattern.

<?xml version="1.0"?>
 <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                 xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                 xmlns:wl="w/l"
                 version="2.0">

  <xsl:output indent="yes"/>

  <xsl:function name="wl:keywords" as="xsd:string*">
    <xsl:param name="string" as="xsd:string"/>
    <xsl:param name="kwd"    as="xsd:string"/>
    <xsl:analyze-string regex="{$kwd}" select="$string">
      <xsl:matching-substring>
        <xsl:value-of select="."/>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:function>

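  <!-- tokenize() also returns the empty strings before a leading keyword,
       after a trailing one, and between adjacent ones, so that $chunks[$i]
       precedes and $chunks[$i + 1] follows the i-th keyword. Collecting
       only the non-matching substrings would shift the chunks whenever
       the text starts with a keyword, as the sample input does. -->
  <xsl:function name="wl:chunks" as="xsd:string*">
    <xsl:param name="string" as="xsd:string"/>
    <xsl:param name="kwd"    as="xsd:string"/>

    <xsl:sequence select="tokenize($string, $kwd)"/>
  </xsl:function>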

  <xsl:template match="/">
    <xsl:apply-templates select="/*"/>
  </xsl:template>

  <xsl:template match="text">
   <xsl:variable name="keywords" as="xsd:string*"
                 select="wl:keywords(.,'[pqr]')"/>
   <xsl:variable name="chunks"   as="xsd:string*"
                 select="wl:chunks(.,'[pqr]')"/>

   <wl:res>
    <xsl:for-each select="$keywords">
      <xsl:variable name="i" select="position()"/>
      <xsl:variable name="before" as="xsd:string*"
                    select="$chunks[$i]"/>
      <xsl:variable name="after" as="xsd:string*"
                    select="$chunks[$i + 1]"/>
      <xsl:choose>
        <xsl:when test="matches($before,'[ab]$')">
        </xsl:when>
        <xsl:when test="matches($after,'^[xyz]')">
        </xsl:when>
        <xsl:otherwise>
          <wl:real-keyword>
            <xsl:value-of select="$keywords[$i]"/>
          </wl:real-keyword>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>

   </wl:res>
  </xsl:template>
</xsl:stylesheet>

When applied to

  <table>
    <text>p aq pz is q it p</text>
  </table>

the result is

 <wl:res xmlns:wl="w/l" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
   <wl:real-keyword>p</wl:real-keyword>
   <wl:real-keyword>q</wl:real-keyword>
   <wl:real-keyword>p</wl:real-keyword>
</wl:res>
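
The "single function" variant mentioned above might look roughly like the
following sketch. The name wl:real-keywords and its $behind and $ahead
parameters are made up for illustration. It assumes an XSLT 2.0 processor
and uses tokenize() for the chunks, because tokenize() keeps the empty
strings at the edges, so that chunk i precedes and chunk i + 1 follows
keyword i.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                xmlns:wl="w/l"
                version="2.0">

  <xsl:output indent="yes"/>

  <!-- Hypothetical helper: returns the matches of $kwd that are neither
       preceded by text matching $behind nor followed by text matching
       $ahead. $behind should end in '$', $ahead should start with '^'. -->
  <xsl:function name="wl:real-keywords" as="xsd:string*">
    <xsl:param name="string" as="xsd:string"/>
    <xsl:param name="kwd"    as="xsd:string"/>
    <xsl:param name="behind" as="xsd:string"/>
    <xsl:param name="ahead"  as="xsd:string"/>

    <!-- One more chunk than keywords: empty strings are kept. -->
    <xsl:variable name="chunks" as="xsd:string*"
                  select="tokenize($string, $kwd)"/>

    <xsl:variable name="keywords" as="xsd:string*">
      <xsl:analyze-string regex="{$kwd}" select="$string">
        <xsl:matching-substring>
          <xsl:sequence select="."/>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </xsl:variable>

    <!-- Keep keyword i only if both neighbouring chunks pass. -->
    <xsl:sequence select="
      for $i in 1 to count($keywords)
      return $keywords[$i][not(matches($chunks[$i], $behind))
                           and not(matches($chunks[$i + 1], $ahead))]"/>
  </xsl:function>

  <xsl:template match="text">
    <wl:res>
      <xsl:for-each select="wl:real-keywords(., '[pqr]', '[ab]$', '^[xyz]')">
        <wl:real-keyword>
          <xsl:value-of select="."/>
        </wl:real-keyword>
      </xsl:for-each>
    </wl:res>
  </xsl:template>
</xsl:stylesheet>
```

For the sample input above, the chunks are "", " a", " ", "z is ", " it "
and "", so the first p, the second q and the last p survive. Note that
the lookbehind pattern only ever sees the text since the previous keyword
match, which is enough for the single-character classes used here.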

-W


On 9 May 2011 10:41, Andrew Welch <andrew.j.welch@xxxxxxxxx> wrote:
>
> On 9 May 2011 09:32, Clint Redwood <clint@xxxxxxxxxxxxxxx> wrote:
> > Hi,
> >
> > I'm using regexp in XSL to process input text data into a selection of
> > fixed values. This works for most of my requirements but I have one key word
> > which I need to exclude if preceded by one word or followed by another. It
> > appears that lookahead and lookbehind are not included in the XSL2.0
> > specification, so I'm looking for an alternative way of doing it.
> >
> > The only way I've found so far involves lots of nested groups with negated
> > character classes and is very messy with long words.
> >
> > Anyone got a neat way of doing it?
>
> Hard to say without sample inputs and outputs... could you add another
> step after the regex processing and exclude the key word there?
>
>
> --
> Andrew Welch
> http://andrewjwelch.com
