|
[XQuery Talk Mailing List Archive Home] [By Date] [By Thread] [By Subject] [By Author] [Recent Entries] [Reply To This Message] Regular Expression searchJohn Snelson jsnelson at sleepycat.comFri Dec 16 10:45:11 PST 2005
Martin Probst wrote: >>>Are you sure? It's probably possible for simple cases (e.g. "Foo|Bar"), >>>but for general regular expressions? How would you do that? >> >>Sleepycat's Berkeley DB XML has what it calls a "substring" index which >>is currently used to optimise fn:contains(), fn:starts-with() and >>fn:ends-with(). This works by splitting the content down into sequential >>three character segments, ie: >> >>"abccccb" is split into "abc", "bcc", "ccc", "ccb" >> >>This type of index could be used to optimise regular expression. If you >>define a regular expression to match the string above, it might look >>like this: >> >>"abc+b" >> >> From this regular expression, you can see that the keys you need to >>look up in the container are: >> >>"abc" & ("bcb" | ("bcc" & "ccb") | ("bcc" & "ccc" & "ccb")) > > Interesting. Though of course the real general case for Regular > Expressions is probably just out of reach. Yes. This scheme is never going to be able to optimise regular expressions with less than three sequential non-wilcard characters in it. However I would suggest that quite often this sort of index optimisation will be useful to real world applications. John -- John Snelson, Berkeley DB XML Engineer Sleepycat Software, Inc http://www.sleepycat.com Contracted to Sleepycat through Parthenon Computing Ltd http://blog.parthcomp.com/dbxml
|
PURCHASE STYLUS STUDIO ONLINE TODAY!Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced! Download The World's Best XML IDE!Accelerate XML development with our award-winning XML IDE - Download a free trial today! Subscribe in XML format
|






