
RE: XPath 1.5? (was RE: typing and markup)


xpath indexing
>
> I think that optimization of // is a more compelling way to use knowledge
> of complex types. Suppose you have a pattern like this:
>
>          //address
>
> Without knowledge of the complex types involved, this requires examination
> of all elements in the document to see if they are "address" elements.
> Looking at the schema for a particular invoice document, it is easy to see
> that the above pattern can only match shipping or billing addresses found
> in customers. The optimizer can rewrite the above pattern as follows:
>
>          /customer/billing/address | /customer/shipping/address
>
> In at least some environments, this will be much more efficient to execute.

Yes, //address makes a very good case study. But the answer isn't clear-cut.
The above rewrite is one way of optimizing it. Another way is to use an
index. Here there is a real difference between XSLT and XQuery: with XQuery,
the documents are typically built and indexed long before the query is
written, so as with a relational database, indexing decisions are made by
database designers based on guesswork about the future query workload. With
XSLT, the stylesheet is usually compiled before the source document(s) are
parsed and turned into trees, so the stylesheet can direct that indexes
should be built to support the access paths it wants to be fast. One idea I
have played with in Saxon is to build a mini-schema for the document as it
is being parsed (essentially an A-contains-B graph) - this could potentially
be more useful than the actual schema since it describes what is actually
present in the document, rather than what is permitted to be present.
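
As a rough illustration of the A-contains-B idea (a sketch only, not Saxon's
code; the function names and the sample document are invented for the
example), the following Python records which element names were observed
inside which parents during a streaming parse, then uses that graph to list
the concrete paths that //address could match. It assumes no namespaces, and
it gives up (returns None) when the observed structure is recursive, since
no finite union of paths exists in that case.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample document, modelled on the invoice example in the thread.
SAMPLE = b"""<customer>
  <name>A. Buyer</name>
  <billing><address>1 High St</address></billing>
  <shipping><address>2 Low Rd</address></shipping>
</customer>"""


def build_containment_graph(xml_source):
    """Parse the document once and return (root_name, graph), where
    graph[parent] is the set of child element names actually observed."""
    graph = defaultdict(set)
    stack = []          # names of the currently open elements
    root_name = None
    for event, elem in ET.iterparse(xml_source, events=("start", "end")):
        if event == "start":
            if stack:
                graph[stack[-1]].add(elem.tag)
            else:
                root_name = elem.tag
            stack.append(elem.tag)
        else:
            stack.pop()
    return root_name, dict(graph)


def paths_to(root_name, graph, target):
    """List the absolute paths that reach `target` in the observed graph.
    Returns None if the graph is recursive, meaning the caller should
    fall back to a full scan instead of a path rewrite."""
    results = []

    def walk(name, path):
        if name in path:
            return False  # element nested inside itself: no finite rewrite
        path = path + [name]
        if name == target:
            results.append("/" + "/".join(path))
        return all(walk(child, path) for child in sorted(graph.get(name, ())))

    return results if walk(root_name, []) else None


root_name, graph = build_containment_graph(io.BytesIO(SAMPLE))
rewritten = paths_to(root_name, graph, "address")
print(" | ".join(rewritten))
```

Because the graph describes what is present rather than what is permitted,
a rewrite derived this way is only valid for the document it was built from.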

What Saxon actually does with //address is to compute it from a full
document scan the first time it is used on a particular document, then to
save the results for subsequent occasions: a sort of "just-in-time index". I
think it's high time dynamic indexing ideas for persistent databases were
revisited; most of the work I've seen dates from the 1970s.
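
The "just-in-time index" behaviour can be sketched in a few lines of Python.
This is an illustration of the caching idea only, not Saxon's
implementation: the class name and the scan counter are invented for the
example, and it assumes the tree is not modified once a result has been
cached.

```python
import xml.etree.ElementTree as ET


class JitDescendantIndex:
    """Answers //name for one document: the first request for a given name
    scans the whole tree, later requests reuse the saved result."""

    def __init__(self, root):
        self._root = root
        self._cache = {}   # element name -> list of matching elements
        self.scans = 0     # number of full document scans performed

    def descendants(self, name):
        if name not in self._cache:
            self.scans += 1
            # Element.iter walks the root and all descendants in document order.
            self._cache[name] = list(self._root.iter(name))
        return self._cache[name]


doc = ET.fromstring(
    "<customer>"
    "<billing><address>1 High St</address></billing>"
    "<shipping><address>2 Low Rd</address></shipping>"
    "</customer>"
)
index = JitDescendantIndex(doc)
first = index.descendants("address")    # triggers the one full scan
second = index.descendants("address")   # served from the saved result
print(len(first), index.scans)
```

The cost of the first lookup is unchanged; the saving only appears when the
same pattern is evaluated repeatedly against the same document.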

All this goes to prove that there is scope for plenty of PhD theses on
XPath/XQuery optimization. I think it's obvious that there are optimizations
that can be done with knowledge of the schema that can't be done without,
but exactly what those optimizations are is still a matter for research.

Michael Kay
Software AG
home: Michael.H.Kay@n...
work: Michael.Kay@s...

