XQuery: is FLWR a <xsl:foreach/> ?
At 06:38 AM 2/23/2001 -0800, Evan Lenz wrote:
>Jonathan Robie wrote:
>
> > To a database person, it is somewhat surprising that your
> > paper does not explicitly mention joins, which are one of
> > the biggest reasons for FLWR expressions in XQuery. Joins
> > are central to database functionality, and it is important
> > to express them in a way that allows optimization based on
> > patterns detected in the expressions. I also notice that the
> > examples in your paper do not include any examples from
> > Section 3 of the XQuery paper, which shows how conventional
> > SQL-like queries are done.
>
>That's because Section 3 does not introduce any new query functionality.
>Using joins over an XML view of a relational database is just another use
>of the FLWR expression. The XSLT mappings to these are just as determinate
>as the rest.

I think this is a central difference between our views. One of the reasons for FLWR expressions is that there is an extensive literature on optimizing these kinds of expressions in SQL, OQL, and various tree-structured languages that are related to these first two languages. Although it may be possible to perform similar optimizations on XSLT, this clearly falls in the "future research" category.

A fundamentally important issue: how does a query optimizer recognize patterns in the query, correlate them with information about the schema and the data, and rewrite the query in ways that can be proven to be equivalent and perform much faster?
For instance, suppose I have the following XQuery:

    FOR $i IN //invoice,
        $p IN distinct($i//product)
    WHERE $i/customer = "ACME",
          $p/name = "screwdriver"
    RETURN
        <product_ordered>
            $p,
            $i/date
        </product_ordered>

The query optimizer should be able to see that the WHERE conditions can be lifted up into the XPath:

    FOR $i IN //invoice[customer="ACME"][.//product/name="screwdriver"],
        $p IN distinct($i//product)
    RETURN
        <product_ordered>
            $p,
            $i/date
        </product_ordered>

Now the query optimizer can look to see whether a datastore has an index on customer or on product name. Perhaps the indexes also have the quantities of the items. If there are tens of thousands of invoices for ACME, but only one invoice for a screwdriver, then it will act differently than it would if there were tens of thousands of invoices for screwdrivers, but only one for ACME.

At any rate, my own knowledge of query optimization is not deep, so I don't want to play the expert here. Guido Moerkotte has written an excellent survey on query optimization techniques which you can access here:

http://pi3.informatik.uni-mannheim.de/staff/mitarbeiter/moer/querycompiler.ps

If you want us to use XSLT syntax directly instead of our FLWR expressions, I need to know the answers to questions like these:

1. What are the equivalences that can be exploited for query optimization?
2. What are the typing rules for the possible <xsl:foreach/> constructs?
3. How are the various possible <xsl:foreach/> constructs translated into SQL? (fill in your favorite environment in place of SQL)

Do you know of any good work in these areas? Please don't ask me to do it myself, or ask for proof that it cannot be done. If we want a solution in a reasonable amount of time, we should build on work that exists.

Jonathan

There are also aspects of XQuery optimization that fall solidly into the "future research" category.

These are my opinions right now. They may be quite different from the opinions of Software AG, the W3C XML Query Working Group, or the opinions that I will have after reading and considering your response.
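
The invoice example in the message can be illustrated with a small Python sketch. It is only an analogy, not how any XQuery engine is implemented: the `Invoice` record, the `build_index` helper, and the sample data are all hypothetical. The first function filters after enumerating every (invoice, product) pair, as the WHERE form does. The second probes two assumed indexes, starts from the smaller set of matches, and intersects. In this sketch the product-name test is kept on the product after the invoice-level filter, because the invoice-level predicate alone would also admit the other products on a matching invoice.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Invoice:
    customer: str
    date: str
    products: Tuple[str, ...]  # product names only, for brevity


def naive(invoices: List[Invoice]) -> List[Tuple[str, str]]:
    # Enumerate every (invoice, distinct product) pair, then filter.
    return [
        (p, inv.date)
        for inv in invoices
        for p in dict.fromkeys(inv.products)  # order-preserving stand-in for distinct()
        if inv.customer == "ACME" and p == "screwdriver"
    ]


def build_index(
    invoices: List[Invoice], keys: Callable[[Invoice], Iterable[str]]
) -> Dict[str, Set[int]]:
    # Hypothetical index: key value -> positions of the invoices that carry it.
    index: Dict[str, Set[int]] = {}
    for pos, inv in enumerate(invoices):
        for k in keys(inv):
            index.setdefault(k, set()).add(pos)
    return index


def pushed_down(
    invoices: List[Invoice],
    by_customer: Dict[str, Set[int]],
    by_product: Dict[str, Set[int]],
) -> List[Tuple[str, str]]:
    # Probe both indexes and scan the smaller (more selective) match set.
    a = by_customer.get("ACME", set())
    b = by_product.get("screwdriver", set())
    small, large = (a, b) if len(a) <= len(b) else (b, a)
    hits = sorted(pos for pos in small if pos in large)
    return [
        (p, invoices[pos].date)
        for pos in hits
        for p in dict.fromkeys(invoices[pos].products)
        if p == "screwdriver"  # residual predicate on the product itself
    ]


# Made-up sample data.
invoices = [
    Invoice("ACME", "2001-02-01", ("hammer", "screwdriver", "screwdriver")),
    Invoice("ACME", "2001-02-02", ("hammer",)),
    Invoice("Initech", "2001-02-03", ("screwdriver",)),
]
by_customer = build_index(invoices, lambda inv: [inv.customer])
by_product = build_index(invoices, lambda inv: inv.products)

print(naive(invoices))
print(pushed_down(invoices, by_customer, by_product))
```

Both functions return the same rows for this data; the difference is how many invoices each one has to visit, which is the kind of choice the message says an optimizer should be able to make from index statistics.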