Subject: Re: Regular expression functions (Was: Re: comments on December F&O draft)
From: Jeni Tennison <jeni@xxxxxxxxxxxxxxxx>
Date: Sun, 13 Jan 2002 00:33:56 +0000
Hi David,

>> Honestly, I can't see much difference between having to handle this
>> syntax and having to handle:
> ah that's easy I just use d-o-e for that:-)

Quite. But (I think) you would tell people who presented you with that
kind of format that changing the source XML was a better approach than
trying to design a stylesheet to handle it. Did I get it right, by the
way? ;)

> I'm not convinced yet actually, I think it duplicates too much
> existing xslt functionality. It's clear that lex/yacc accepts a
> larger class of grammars than regular expressions (I see Dimitre's
> already supplied the details), but I think (somehow) the extra
> functionality (which basically just always comes down to nesting,
> counting and storing information for look-ahead) is already present
> in xslt so the trick is to add regexp support (only) in a way that
> the extra arithmetic functionality can be pulled from xpath/xslt for
> those occasions when you need it. Since we argued a long time ago
> that xslt was turing complete (given an approximation to an infinite
> tape) everything's possible anyway so it's only a matter of
> convenience.

Well of course it's all just a matter of convenience :) This seemed a
very convenient way, to me, of parsing what you'd described. I'd
rather describe the grammar than describe the parsing process, if you
see what I mean. Perhaps it's all the BNF in the WDs affecting me...

The really light-weight method is a simple match() function that
returns start/length pairs of integers: one pair for the whole match,
then one pair per parenthesised subexpression. If you had that, then
assuming templates were allowed to match simple typed values, you
could parse your string with something like:

<xsl:template match="value of type xs:string" mode="row">
  <xsl:variable name="match" select="match(., '^\\([a-z]+)\{')" />
  <xsl:choose>
    <xsl:when test="$match[1] = 1">
      <xsl:variable name="name"
                    select="substring(., $match[3], $match[4])" />
      <start name="{$name}" />
      <!-- carry on after the match, i.e. from start + length -->
      <xsl:apply-templates select="substring(., $match[1] + $match[2])"
                           mode="row" />
    </xsl:when>
    <xsl:otherwise>
      <xsl:apply-templates select="." mode="expr" />
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

... and so on in a way that it's really too late at night to detail.
You can manually keep track of brackets with parameters and so on.
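
The start/length idea can be tried out outside XSLT. The Python below is only a rough sketch under stated assumptions, not a proposal for the F&O draft: the names match_pairs, xpath_substring and parse_row are hypothetical, positions are 1-based as in XPath's substring(), the result is a flat list (whole match first, then one pair per subexpression), and the remainder of the string is taken from start + length.

```python
import re


def match_pairs(text, pattern):
    """Return [start, length, g1_start, g1_length, ...] using 1-based starts.

    An empty list means the pattern did not match at all.
    """
    m = re.search(pattern, text)
    if m is None:
        return []
    pairs = []
    for i in range(m.re.groups + 1):
        start, end = m.span(i)
        if start < 0:
            # A subexpression that took no part in the match: encode as 0, 0.
            pairs += [0, 0]
        else:
            pairs += [start + 1, end - start]
    return pairs


def xpath_substring(text, start, length=None):
    """Simplified 1-based substring() for integer arguments with start >= 1."""
    begin = start - 1
    return text[begin:] if length is None else text[begin:begin + length]


def parse_row(text):
    """Peel leading backslash-name-brace commands off the front of text."""
    events = []
    while True:
        m = match_pairs(text, r'^\\([a-z]+)\{')
        if m and m[0] == 1:
            # Pair 2 (items 3 and 4 in XPath terms) locates the command name.
            events.append(("start", xpath_substring(text, m[2], m[3])))
            # Continue with whatever follows the matched prefix.
            text = xpath_substring(text, m[0] + m[1])
        else:
            # No command at the front: hand the rest over as an expression.
            events.append(("expr", text))
            return events
```

For instance, parse_row(r'\frac{a}{b}') gives a start event for frac followed by the unparsed remainder a}{b}, which is the point where the manual bracket tracking would have to take over.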

I imagine that this is similar to what you're doing at the moment,
except that a regexp, rather than substring() et al., might make some
parts of your life easier.

Actually, I think that the \frac{...}{...} construct is fairly
difficult to handle, but like you say, anything can be done.
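
The difficulty is that each argument of \frac can itself contain braces, so finding where it ends means counting nesting depth, which a single regular expression cannot do. A minimal Python sketch of that counting follows; the function names read_group and split_frac are made up for illustration, and it assumes braces are never escaped and that the two groups follow \frac with no whitespace in between.

```python
def read_group(text, pos):
    """Return (content, next_pos) for the brace group that opens at text[pos]."""
    if pos >= len(text) or text[pos] != "{":
        raise ValueError("expected '{' at position %d" % pos)
    depth = 0
    for i in range(pos, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                # Matching close brace found: return what lies between.
                return text[pos + 1:i], i + 1
    raise ValueError("unbalanced braces")


def split_frac(text):
    """Split a string starting with a frac command into (num, den, rest)."""
    prefix = "\\frac"
    if not text.startswith(prefix):
        raise ValueError("text does not start with a frac command")
    numerator, pos = read_group(text, len(prefix))
    denominator, pos = read_group(text, pos)
    return numerator, denominator, text[pos:]
```

In a stylesheet the depth counter would become a parameter passed down a recursive template, which is the "keeping track of brackets with parameters" approach.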

I'll think some more...

> given a lex implementation you could (I think, haven't checked)
> implement an Xpath parser if you wished...

Already got an XPath parser. Parsing is the easy part of evaluate() -
the evaluation is the hard part (to implement in XSLT).



Jeni Tennison

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
