
Subject: Re: why matches($title,'.*?(\.|,)\s*$')) can perform so much worse than matches($title,'(\.|,)\s*$'))
From: Oliver Hallam <oliver@xxxxxxxxxxx>
Date: Wed, 13 Jul 2011 18:43:55 +0100
That is interesting.  I was aware that there are some very smart regex
engines out there, but wasn't aware that they had made it to any
XQuery/XSLT processors yet.

Another interesting article is this one describing some of the
optimizations performed by the regex engine in Google Chrome:
http://blog.chromium.org/2009/02/irregexp-google-chromes-new-regexp.html

This mentions another trick used by some regex implementations.  In
their example "Sun|Mon", their engine recognises that any match for this
expression has "n" as its third character, and so rather than testing
for a match at each index in the string (which was the problem with the
example given), they first scan the string for "n" characters and only
try to apply the regex starting two characters before each one.  I would
not be at all surprised if they also recognised that a regex beginning
with .* need only be applied at the first character.
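
To make the "Sun|Mon" trick concrete, here is a minimal sketch in Python.
It is not how irregexp is implemented; the function name
prefiltered_search is made up for illustration, and the fixed offset of
two is hard-coded for this one pattern rather than derived from it.

```python
import re

# The example pattern from the article: every match has "n" as its
# third character.
PATTERN = re.compile(r"Sun|Mon")


def prefiltered_search(text):
    """Return the leftmost match of Sun|Mon, or None.

    Hypothetical helper: instead of trying the regex at every index,
    scan for "n" and only attempt a match two characters before it.
    """
    # An "n" at index 0 or 1 cannot be the third character of a match.
    pos = text.find("n", 2)
    while pos != -1:
        # Anchored attempt at the only start position this "n" allows.
        match = PATTERN.match(text, pos - 2)
        if match:
            return match
        pos = text.find("n", pos + 1)
    return None
```

On a string with few "n" characters, nearly all of the work is done by
the plain character scan, and the regex itself is attempted only a
handful of times.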

Oliver
XQSharp


On 13/07/2011 15:13, Michael Kay wrote:

It would be perfectly valid (and sensible) for a query processor to realise that the two expressions you gave were equivalent and so not perform n^2 tests, but I am unaware of a processor that makes these kinds of optimizations to regular expressions.
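
As a rough illustration of why the two expressions are equivalent, the sketch below checks that they agree on a few sample titles. It uses Python's re module as a stand-in, which is an assumption: its regex dialect differs from the XPath one, and the sample titles and helper names are invented. With unanchored search semantics, a leading .*? adds nothing, because the search already tries every start position.

```python
import re

# The two patterns from the subject line, applied with unanchored
# "search" semantics, as fn:matches does in XPath.
LAZY_PREFIX = re.compile(r".*?(\.|,)\s*$")
SUFFIX_ONLY = re.compile(r"(\.|,)\s*$")


def ends_with_punctuation(title, pattern):
    """True if the title ends in '.' or ',' plus optional whitespace."""
    return pattern.search(title) is not None


def patterns_agree(title):
    """Hypothetical helper: do both patterns give the same answer?"""
    return (ends_with_punctuation(title, LAZY_PREFIX)
            == ends_with_punctuation(title, SUFFIX_ONLY))


# Invented sample titles, for illustration only.
SAMPLES = ["Title.", "Title", "A, B", "Title,  ", "", "3.14 and more"]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(repr(sample), patterns_agree(sample))
```

The sketch says nothing about timing. Whether the first form is actually slower depends on the engine, which is the point under discussion.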

Actually I've heard it said that there's a wide variation between different regex engines in how well they handle this kind of thing. See for example here:


http://swtch.com/~rsc/regexp/regexp1.html

The article at

http://eyalsch.wordpress.com/2009/05/21/regex/

is also useful.

Michael Kay
Saxonica
