Subject: RE: Regular expression functions (Was: Re: comments on December F&O draft)
From: "Marc Portier" <mpo@xxxxxxxxxxxxxxxx>
Date: Fri, 11 Jan 2002 00:49:00 +0100
Hi Jeni,

> -----Original Message-----
> From: Jeni Tennison [mailto:jeni@xxxxxxxxxxxxxxxx]
> Sent: Thursday, 10 January 2002 14:05
> To: Marc Portier
> Cc: Steven Noels; xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: Re: Regular expression functions (Was: Re:  comments on
> December F&O draft)
> Hi Marc,
> > some
> > <regex name="fancy-number">[0-9]+(\.[0-9]+)?([Ee][+-][0-9]+)?</regex>
> >
> > could then later be used inside
> > <matcher name="" regex="(other groups):fancy-number:(other groups)">
> > ... while nested matchers or output-selecting elements could then use
> > group selections like
> > 1.      <...    select-group="1"> ... or 2, referring to counting the
> >         parentheses in the scoped regex of this matcher
> > 2.      <... select-group=":fancy-number:2" >
> > </matcher>
> >
> > could be challenging to implement (spontaneous idea of using the
> > indexes as offsets in counting parentheses)
> I like this method better than the Omnimark method of assigning the
> names within the regular expression itself, because it doesn't clutter
> the regular expression (if anything it makes it more readable) and it
> allows regular expressions to be reused.

> There are a couple of issues that would need to be worked out with it,
> though. What happens if you have a regular expression that involves
> two instances of the named subexpression at the same level:
>   <matcher name="two-numbers" regexp=":fancy-number:\w:fancy-number:">
>     ...
>   </matcher>
> You need to have separate indexes to indicate which one you're talking
> about, plus some kind of syntax to pull out submatches within the
> named subexpression. Borrowing from XPath syntax (which might be a bad
> idea), you might have:
>   fancy-number[2]/*[2]
Yep. I was short on internet time just before I left and sent that reply;
it crossed my mind later that one regex could indeed be reused twice inside
another one. Nice to see there is already a syntax familiar to XSLT people
that would help out.
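
A minimal Python sketch of the "indexes as offsets" idea with two instances
of the same named regex, for illustration only. Each :name: reference is
wrapped in a uniquely named group, and a parenthesis group inside it is
addressed as an offset from that wrapper's index. NAMED, expand() and
select() are hypothetical names, not part of any XSLT proposal, and the
sketch assumes that :name: never collides with other regex syntax.

```python
import re

# Hypothetical registry, mirroring the <regex name="..."> declarations.
NAMED = {"fancy-number": r"[0-9]+(\.[0-9]+)?([Ee][+-][0-9]+)?"}

# A reference looks like :name: (assumed not to clash with other syntax).
REF = re.compile(r":([A-Za-z][A-Za-z0-9-]*):")


def expand(pattern, named=NAMED):
    """Replace each known :name: with a uniquely named wrapper group.

    Returns the compiled regex and a dict that maps (name, occurrence)
    to the Python group name of the wrapper.
    """
    seen = {}
    wrappers = {}

    def substitute(m):
        name = m.group(1)
        if name not in named:
            return m.group(0)  # unknown name: leave the text untouched
        seen[name] = seen.get(name, 0) + 1
        group_name = "%s_%d" % (name.replace("-", "_"), seen[name])
        wrappers[(name, seen[name])] = group_name
        return "(?P<%s>%s)" % (group_name, named[name])

    return re.compile(REF.sub(substitute, pattern)), wrappers


def select(match, compiled, wrappers, name, occurrence, inner=0):
    """Return group `inner` of the given occurrence (0 = whole sub-match)."""
    base = compiled.groupindex[wrappers[(name, occurrence)]]
    return match.group(base + inner)


compiled, wrappers = expand(r":fancy-number:\w:fancy-number:")
match = compiled.fullmatch("1.5E+3x2.0E-7")
# Second parenthesis group of the second fancy-number (its exponent part).
print(select(match, compiled, wrappers, "fancy-number", 2, inner=2))
```

For the input above this prints E-7. Groups are numbered by their opening
parenthesis, so the wrapper's own index plus the inner index is enough to
locate a group inside a reused regex.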

> to indicate the second subexpression of the second fancy-number
> subexpression in the matched string.
Trying to understand it completely, though:

You mean that *[index] puts all named subregexes into one array and picks
the second one regardless of its name, right?

Getting an actual parenthesis group out of a named subregex would be
different, no?
An example of the nuance I'm seeing: how would I select the exponent group
out of the second matched fancy-number in the following settings?

With no sub-subregexes, only parenthesis groups:
<regex name="fancy-number">[0-9]+(\.[0-9]+)?([Ee][+-][0-9]+)?</regex>
<matcher name="two-numbers" regexp=":fancy-number:\w:fancy-number:">

compared to:
<regex name="exponent">[Ee][+-][0-9]+</regex>
<regex name="fractalpart">\.[0-9]+</regex>
<regex name="fancy-number">[0-9]+:fractalpart:?:exponent:?</regex>
<matcher name="two-numbers" regexp=":fancy-number:\w:fancy-number:">
or select-group="fancy-number[2]/exponent"
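
A rough Python sketch of how a path such as fancy-number[2]/exponent could
be resolved when named regexes nest, under the same caveats: expand(),
parse_path() and select() are hypothetical helpers, only named steps are
supported (no *[n] wildcard), and the definitions are assumed to be free of
cycles.

```python
import re

# Hypothetical registry using the nested definitions from this message.
NAMED = {
    "exponent": r"[Ee][+-][0-9]+",
    "fractalpart": r"\.[0-9]+",
    "fancy-number": r"[0-9]+:fractalpart:?:exponent:?",
}

REF = re.compile(r":([A-Za-z][A-Za-z0-9-]*):")
STEP = re.compile(r"([A-Za-z][A-Za-z0-9-]*)(?:\[([0-9]+)\])?$")


def expand(pattern, named, prefix=(), paths=None):
    """Recursively expand :name: references into named groups.

    Returns the expanded regex source and a dict mapping a path, i.e. a
    tuple of (name, occurrence) steps, to the generated group name.
    """
    if paths is None:
        paths = {}
    counts = {}  # occurrences are counted per nesting level

    def substitute(m):
        name = m.group(1)
        if name not in named:
            return m.group(0)  # unknown name: leave the text untouched
        counts[name] = counts.get(name, 0) + 1
        path = prefix + ((name, counts[name]),)
        group_name = "g%d" % (len(paths) + 1)
        paths[path] = group_name
        inner, _ = expand(named[name], named, path, paths)
        return "(?P<%s>%s)" % (group_name, inner)

    return REF.sub(substitute, pattern), paths


def parse_path(text):
    """Turn 'fancy-number[2]/exponent' into (name, index) steps."""
    steps = []
    for step in text.split("/"):
        m = STEP.match(step)
        if m is None:
            raise ValueError("unsupported step: %r" % step)
        steps.append((m.group(1), int(m.group(2) or 1)))
    return tuple(steps)


def select(match, paths, path_text):
    """Return the text matched by the sub-regex the path points at."""
    return match.group(paths[parse_path(path_text)])


source, paths = expand(r":fancy-number:\w:fancy-number:", NAMED)
match = re.fullmatch(source, "1.5E+3x2.0E-7")
print(select(match, paths, "fancy-number[2]/exponent"))
```

For this input the selection yields E-7. The paths dict is in effect a
flattened form of the tree of subexpression matches discussed below.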

> Actually, that syntax isn't all that bad - you can imagine the matcher
> actually builds up a tree structure based on the subexpression
Yep, I need some more imagination before actually building it, though :-)

> matches (you need 'anonymous' elements for unnamed subexpressions, but
> you should be able to get away with that using elements in some
> restricted namespace or something)...
Mmm... I don't understand how we could get unnamed subexpressions.
As far as I can see now, we'd need :name: to slice them in, no?

> > this also makes me think about your earlier mention of dynamic
> > regexes: you probably expect anything that qualifies as a
> > text-representing xsl parameter to possibly carry part of the
> > regex to be executed...
> I think that if you could build the named regular expressions
> dynamically, then this idea would work fine. Going back to the keyword
> example that I used in an earlier mail, you could do:
> <xsl:regexp name="keyword-as-word"
>             select="concat('\W', $keyword, '\W')" />
> If named regular expressions were like variables, you could assign
> them values at the global or local level...
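
As a point of comparison, a small Python sketch of building such a regex at
run time from a keyword. keyword_as_word() is a hypothetical helper, and the
call to re.escape is an addition that the concat() example above does not
have: it keeps characters such as + or . in the keyword from being read as
regex syntax.

```python
import re


def keyword_as_word(keyword):
    """Build \\W + keyword + \\W, the same shape as the concat() call."""
    # re.escape is an assumption of this sketch, not part of the proposal.
    return re.compile(r"\W" + re.escape(keyword) + r"\W")


pattern = keyword_as_word("xsl")
print(pattern.search("use xsl here") is not None)  # keyword as its own word
print(pattern.search("xslt rocks") is not None)    # keyword inside a word
```

Because \W has to consume a character on each side, a keyword at the very
start or end of the string is not matched by this pattern.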

> Cheers,
> Jeni
> ---
> Jeni Tennison
> http://www.jenitennison.com/

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
