Re: why matches($title,'.*?(\.|,)\s*$')) can perform s
O > Another interesting article is this one describing some of the > optimizations performed by the regex engine in Google Chrome: > http://blog.chromium.org/2009/02/irregexp-google-chromes-new-regexp.html > > This mentions another trick used by some regex implementations. In > their example "Sun|Mon", their engine recognises that a match for this > expression always contains "n" in the third character, and so rather > than testing for a match at each index in the string (which was the > problem with the example given) they first scan the string to find "n" > characters and only try to apply the regex starting two characters > preceding one. This is similar to the Boyer-Moore algorithm that Perl (and probably pcre) also uses - originally it was just for constant strings. There was a fun race to get the fastest "grep" in the early 1990s. The difficulty with a lot of these optimizations is that capturing groups can make them harder - and in Perl people use capturing groups all the time of course. You can't turn (font|foot) into fo(nt|ot), and although you can turn it into (fo(?:nt|ot)) you may then lose the benefit. Since the article Mike Key mentioned was written in 2007, Perl's regexps seem to have got a lot faster. But even in the early 1990s I remember converting sed scripts into perl and getting huge speedup, even though Perl was using recursive back-tracking. The BM-style delta tables probably accounted for most of that. Liam -- Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/ Pictures from old books: http://fromoldbooks.org/
PURCHASE STYLUS STUDIO ONLINE TODAY!
Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced!
Download The World's Best XML IDE!
Accelerate XML development with our award-winning XML IDE - Download a free trial today!
Subscribe in XML format