Re: why matches($title,'.*?(\.

[XSL-LIST Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message]

Re: why matches($title,'.*?(\.|,)\s*$')) can perform s

Subject: Re: why matches($title,'.*?(\.|,)\s*$')) can perform so much worse than matches($title,'(\.|,)\s*$'))
From: Liam R E Quin <liam@xxxxxx>
Date: Wed, 13 Jul 2011 19:12:30 -0400
Re:  why matches($title
O
> Another interesting article is this one describing some of the
> optimizations performed by the regex engine in Google Chrome:
> http://blog.chromium.org/2009/02/irregexp-google-chromes-new-regexp.html
> 
> This mentions another trick used by some regex implementations.  In
> their example "Sun|Mon", their engine recognises that a match for this
> expression always contains "n" in the third character, and so rather
> than testing for a match at each index in the string (which was the
> problem with the example given) they first scan the string to find "n"
> characters and only try to apply the regex starting two characters
> preceding one.

This is similar to the Boyer-Moore algorithm that Perl (and probably
pcre) also uses - originally it was just for constant strings. There was
a fun race to get the fastest "grep" in the early 1990s.

The difficulty with a lot of these optimizations is that capturing
groups can make them harder - and in Perl people use capturing groups
all the time of course. You can't turn (font|foot) into fo(nt|ot), and
although you can turn it into (fo(?:nt|ot)) you may then lose the
benefit.

Since the article Mike Key mentioned was written in 2007, Perl's regexps
seem to have got a lot faster. But even in the early 1990s I remember
converting sed scripts into perl and getting huge speedup, even though
Perl was using recursive back-tracking. The BM-style delta tables
probably accounted for most of that.

Liam


-- 
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/

Current Thread

PURCHASE STYLUS STUDIO ONLINE TODAY!

Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced!

Buy Stylus Studio Now

Download The World's Best XML IDE!

Accelerate XML development with our award-winning XML IDE - Download a free trial today!

Don't miss another message! Subscribe to this list today.
Email
First Name
Last Name
Company
Subscribe in XML format
RSS 2.0
Atom 0.3
Site Map | Privacy Policy | Terms of Use | Trademarks
Free Stylus Studio XML Training:
W3C Member
Stylus Studio® and DataDirect XQuery ™are products from DataDirect Technologies, is a registered trademark of Progress Software Corporation, in the U.S. and other countries. © 2004-2013 All Rights Reserved.