
Subject: RE: Regular expression functions (Was: Re: comments on December F&O draft)
From: "Michael Kay" <michael.h.kay@xxxxxxxxxxxx>
Date: Fri, 11 Jan 2002 14:11:31 -0000
> You mean, assuming that current-match() returned the node tree
> described in the mail, if I did:
>
>   current-match()/mantissa == current-match()/mantissa
>
> would the result be true or false? Or if I did:
>
>   match($string1, $regexp1) == match($string2, $regexp2)
>
> would the result be true or false?

Yes, that is the question.
>
> I think that in both cases returning different trees would be more
> consistent, since user-defined functions won't have the luxury of
> being able to reuse trees.

The argument for returning the same tree is the same as for the document()
function: a call can then be safely optimized by pulling it out of a loop or
by eliminating common subexpressions.
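
To make the optimization argument concrete, here is a minimal Python sketch. It is an analogy rather than XPath: the names Tree, fresh_tree, stable_tree, naive and after_cse are hypothetical, and it assumes the comparison in the quoted expressions is an identity test. It shows that common-subexpression elimination changes the observable result only when the function builds a new tree on each call.

```python
from functools import lru_cache


class Tree:
    """Stand-in for a node tree; instances are compared by identity."""

    def __init__(self, source):
        self.source = source


def fresh_tree(source):
    # Builds a new tree on every call, so two calls never share identity.
    return Tree(source)


@lru_cache(maxsize=None)
def stable_tree(source):
    # Returns the same tree for the same argument on every call.
    return Tree(source)


def naive(make, source):
    # Evaluates the call twice, as the expression is written.
    return make(source) is make(source)


def after_cse(make, source):
    # Evaluates the call once, as an optimizer might after eliminating
    # the common subexpression.
    tree = make(source)
    return tree is tree
```

With fresh_tree, naive() is false but after_cse() is true, so the rewrite is not safe. With stable_tree both are true, so the rewrite cannot be observed.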
>
> > I guess one could say that it's explicitly implementation-defined,
> > and no-one would worry too much about it. But it's also something
> > you want to avoid if at all possible because constructing new trees
> > is always expensive.
>
> Is that because constructing *nodes* is expensive or is it the *links*
> between the nodes within a tree that makes things problematic?

Both. Creating objects with identity is expensive in most languages because
it involves memory-allocation overheads, and the need to support all the axes
makes the objects quite heavyweight.

> If the
> latter, then perhaps documentless nodes are a blessing ;)

Not if it means they have to be copied by physical cloning!

> If the
> former, then it's a good argument for nested sequences, so you don't
> have to create nodes to provide structure.

Yes, there are some good arguments for nested sequences. But let's not go
there; we want to get this thing finished.

In the case of the regular expression functionality you are trying to
define, I've been trying to follow the arguments but haven't reached any
particular views on what the right answer is. I don't have much personal
experience of languages that use regexps heavily, which doesn't help. It
might be that a solution similar to xsl:for-each-group is needed. This was
constrained by the fact that we couldn't model a set of groups directly in
the data model, so instead we defined an instruction to iterate over the set
of groups, presenting one group at a time to the application, and making
that group available through the magic function current-group(). I sort of
feel an xsl:for-each-string-match might work similarly, but I can't
articulate the details yet. Keep working at it, guys.
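
As a rough illustration of the iterate-and-expose-the-current-item idea, here is a Python sketch. It is an analogy only, not a proposal for XSLT syntax or semantics: for_each_group, for_each_string_match, NUMBER and the "exponent" subgroup are hypothetical names (only "mantissa" comes from the quoted example). Each function hands the caller one group or one match at a time, so the whole set never has to be modelled as a tree.

```python
import re
from itertools import groupby


def for_each_group(items, key):
    # Sort so that items with equal keys are adjacent, then yield one
    # group at a time, much as current-group() exposes one group.
    for group_key, members in groupby(sorted(items, key=key), key=key):
        yield group_key, list(members)


def for_each_string_match(text, pattern):
    # Yield one match at a time: the matched text plus its named
    # subgroups as plain strings, with no tree constructed.
    for m in re.finditer(pattern, text):
        yield m.group(0), m.groupdict()


# Assumed pattern for illustration: a mantissa followed by an exponent.
NUMBER = r"(?P<mantissa>\d+(?:\.\d+)?)[eE](?P<exponent>[+-]?\d+)"

if __name__ == "__main__":
    for matched, parts in for_each_string_match("1.5e3 and 2E-4", NUMBER):
        print(matched, parts["mantissa"], parts["exponent"])
```

Because the subgroups come back as strings, the identity question does not arise in this sketch; that is one possible attraction of the iteration approach, not a settled conclusion.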

Mike Kay


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

