
Requirements on Optimizers [was ANSWERS to "What's wrong with XQuery" question]

Michael Sokolov sokolov at ifactory.com
Sun Jul 25 15:05:12 PDT 2010


On 7/25/2010 9:51 AM, Martin Probst wrote:
>> Part of the answer, I think, is to make performance less reliant on good
>> optimization. In XSLT, the key() function goes a long way towards this:
>> by giving programmers a tool to control when indexes are built and used,
>> performance of many join constructs becomes much more predictable
>> [...]
>>
>>      
> It might be nice to have language constructs saying "guarantee to me
> that you do this in O(something), otherwise fail".
>    

That's exactly right: for applications of sufficient scale, it's not
enough to know that a given expression will be evaluated correctly.
It's also critical to understand whether an indexed lookup will be
applied, so as to guarantee completion (or failure) before the universe
ends. I heartily advocate tools (such as XSLT's key()) that enable
programmers to communicate this sort of requirement to the query
evaluator.
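
To make the point concrete, here is a minimal sketch in Python (not
XSLT or XQuery) of what an explicit, programmer-controlled index buys
for a join. The names build_key_index and join_with_index and the
sample data are hypothetical, made up for illustration; the dict is
only analogous in spirit to a key() index, not a model of any
processor's implementation.

```python
from collections import defaultdict


def build_key_index(items, key_fn):
    """Build the index once: key value -> list of items with that key."""
    index = defaultdict(list)
    for item in items:
        index[key_fn(item)].append(item)
    return index


def join_with_index(orders, customer_index):
    """Join orders to customers through the index.

    Each lookup is an expected constant-time dict access, so the join
    costs roughly len(orders) lookups instead of a nested loop over
    orders x customers.
    """
    return [
        (order["id"], customer["name"])
        for order in orders
        for customer in customer_index.get(order["customer"], [])
    ]


# Hypothetical sample data.
customers = [{"id": "c1", "name": "Ann"}, {"id": "c2", "name": "Bob"}]
orders = [
    {"id": "o1", "customer": "c2"},
    {"id": "o2", "customer": "c1"},
    {"id": "o3", "customer": "c2"},
]

customer_index = build_key_index(customers, lambda c: c["id"])
pairs = join_with_index(orders, customer_index)
```

The useful property is that the cost is visible in the code: the index
is built exactly once, where the programmer says so, and the join does
not depend on an optimizer recognizing the pattern.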

This already exists, as vendor-specific extensions, in various XQuery
database implementations: they provide extension functions that allow
explicit invocation of index lookups for full-text queries and, in
some cases, for typed value-based (range) queries. As an aside, a
related and critical feature is the ability for the query writer to
profile queries at a fine grain and/or have some visibility into the
execution plan.

I'm new to this discussion and am surely unaware of a lot of what has
gone on, so please forgive me if I'm asking for something that already
exists! Over the past few years I've been devoted to implementing
large systems using XQuery and have struggled at times to get
predictable performance, especially across multiple platforms, so I
offer the wish list of an implementor.

It would be a good idea to have some agreement on a standard for how
to query using indexes, and possibly how to create them as well. It
does seem to me that XSLT's key() is pretty close to getting it right;
I might prefer to add a few features though :)

1) The ability to query the values of the index keys and their
statistics (for faceting, distinct-values, max and the like); ideally
this could be restricted by an orthogonal sequence of some sort (i.e.,
given a sequence of nodes, how many have each key value?)
2) Specifying type and collation information for indexes (and queries)
to enable range queries
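
A rough sketch of what those two features might look like, again in
Python and purely illustrative: TypedKeyIndex, facet_counts and
range_ids are hypothetical names, not part of any existing XSLT or
XQuery API. Typing is modeled by a conversion function applied to each
key; collation is left out entirely.

```python
import bisect
from collections import Counter


class TypedKeyIndex:
    """A sorted index of (typed key, node id) pairs."""

    def __init__(self, nodes, key_fn, key_type):
        # Convert each raw key to its declared type so that ordering is
        # by value (9 < 10 < 100), not by string ("10" < "100" < "9").
        self.entries = sorted((key_type(key_fn(n)), n["id"]) for n in nodes)
        self.keys = [key for key, _ in self.entries]

    def facet_counts(self, restrict_to=None):
        """Count nodes per key value, optionally only for the given node ids."""
        return Counter(
            key
            for key, node_id in self.entries
            if restrict_to is None or node_id in restrict_to
        )

    def range_ids(self, low, high):
        """Return ids of nodes with low <= key <= high via binary search."""
        start = bisect.bisect_left(self.keys, low)
        stop = bisect.bisect_right(self.keys, high)
        return [node_id for _, node_id in self.entries[start:stop]]


# Hypothetical sample data: prices stored as untyped strings.
books = [
    {"id": "b1", "price": "9"},
    {"id": "b2", "price": "10"},
    {"id": "b3", "price": "100"},
    {"id": "b4", "price": "10"},
]

index = TypedKeyIndex(books, lambda b: b["price"], int)
all_facets = index.facet_counts()
some_facets = index.facet_counts(restrict_to={"b1", "b2"})
cheap = index.range_ids(5, 50)
max_price = index.keys[-1]
```

Here the facet counts and the maximum are answered from the index
alone, without touching the nodes, and the range query only works
because the index knows the keys are integers.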

I understand there is XQuery Full Text, which may address some (or
all?) of my concerns. I hope you will forgive me for not being
completely au courant regarding that: as an implementor, I'm focused
on what's available now and in the near future. But my quick read of
that spec turns up only a few references to indexing, none of which
seem normative.

-Mike


