RE: XPath 1.5? (was RE: typing and markup)
> I think that optimization of // is a more compelling way to use
> knowledge of complex types. Suppose you have a pattern like this:
>
>     //address
>
> Without knowledge of the complex types involved, this requires
> examination of all elements in the document to see if they are
> "address" elements. Looking at the schema for a particular invoice
> document, it is easy to see that the above pattern can only match
> shipping or billing addresses found in customers. The optimizer can
> rewrite the above pattern as follows:
>
>     /customer/billing/address | /customer/shipping/address
>
> In at least some environments, this will be much more efficient to
> execute.

Yes, //address makes a very good case study. But the answer isn't clear-cut. The above rewrite is one way of optimizing it. Another way is to use an index.

Here there is a real difference between XSLT and XQuery: with XQuery, the documents are typically built and indexed long before the query is written, so as with a relational database, indexing decisions are made by database designers based on guesswork about the future query workload. With XSLT, the stylesheet is usually compiled before the source document(s) are parsed and turned into trees, so the stylesheet can direct that indexes should be built to support the access paths it wants to be fast.

One idea I have played with in Saxon is to build a mini-schema for the document as it is being parsed (essentially an A-contains-B graph). This could potentially be more useful than the actual schema, since it describes what is actually present in the document rather than what is permitted to be present.

What Saxon actually does with //address is to compute it from a full document scan the first time it is used on a particular document, then to save the results for subsequent occasions: a sort of "just-in-time index".

I think it's high time dynamic indexing ideas for persistent databases were revisited; most of the work I've seen dates from the 1970s.
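To make the "just-in-time index" idea concrete, here is a minimal sketch in Python using the standard xml.etree.ElementTree module. It is not Saxon's implementation; the class name LazyNameIndex, the scans counter, and the sample invoice document are all made up for illustration. It only shows the general shape: scan the whole document the first time a name is requested, then reuse the saved result.

```python
import xml.etree.ElementTree as ET


class LazyNameIndex:
    """Hypothetical helper: caches the result of //name for one document."""

    def __init__(self, root):
        self.root = root
        self._cache = {}  # element name -> list of matching elements
        self.scans = 0    # number of full document scans (for illustration)

    def descendants_named(self, name):
        """Return all elements called `name`, in document order."""
        if name not in self._cache:
            # First use on this document: one full scan, then remember it.
            self.scans += 1
            self._cache[name] = list(self.root.iter(name))
        return self._cache[name]


# Sample document invented for this sketch.
doc = ET.fromstring(
    "<customer>"
    "<billing><address>1 High St</address></billing>"
    "<shipping><address>2 Low Rd</address></shipping>"
    "</customer>"
)

index = LazyNameIndex(doc)
first = index.descendants_named("address")   # triggers the full scan
second = index.descendants_named("address")  # served from the cache
```

A cache like this is only valid while the tree is unchanged. That suits the case where a source document is parsed once and then only read, but a persistent, updatable store would need some invalidation scheme.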
All this goes to prove that there is scope for plenty of PhD theses on XPath/XQuery optimization. I think it's obvious that there are optimizations that can be done with knowledge of the schema that can't be done without, but exactly what those optimizations are is still a matter for research.

Michael Kay
Software AG
home: Michael.H.Kay@n...
work: Michael.Kay@s...