General comparisons of speed of xquery vs. xslt
Michael Kay mhk at mhk.me.uk
Wed Apr 28 21:36:34 PDT 2004
It's true that dataflow analysis is difficult if the stylesheet makes heavy use of template rules - though there are processors that attempt it. For example, from conversations with Jacek Ambrosiak I think Gregor does a lot of this kind of thing, and I suspect that DataPower's XSLT engine does as well, though it's a commercial product with no details published.

Of course I could defend my thesis on one level by saying that the stylesheet author doesn't have to use template rules. Almost every XQuery construct has a direct equivalent in XSLT (there are some trivial exceptions, for example the capabilities of "order by" are slightly different) and if the XSLT author confines himself to the subset of XSLT that is available in XQuery, then it's obviously true that an XSLT processor can apply exactly the same optimizations.

However, I think most of the optimizations that are generally useful can be achieved in XSLT without imposing this constraint. For example, the vast majority of cases where path expressions don't need to be sorted can be detected statically without great difficulty. Also, one shouldn't underestimate the potential of optimizations where decisions are deferred until run-time, which means they can be made with the benefit of instance information that's not available statically.

XSLT's "2 language" architecture does make global optimization a bit difficult, because the tendency is to engineer the two language implementations as separate components, each of which is handled independently. That's not a stopper, however, and I think the higher performance engines depart from this model. And in some ways it makes life easier, because the run-time architecture for the two languages can be significantly different.

Streaming in XSLT 1.0 was actually easier than in 2.0: some of the new facilities we have introduced (which mimic the way XQuery specifies tree construction) make it more difficult.
In 1.0 XPath evaluation typically used a "pull" stream while XSLT instruction evaluation used a "push" pipeline, and this worked very well. In XQuery the natural technique is always to pull, which means you end up doing more tree construction and copying than is necessary. The optimizer has to do a lot of work to prevent this. The non-composability of XSLT (XSLT instructions can't be nested inside XPath expressions) gives it a great advantage here, which still exists in 2.0, although it's not so clean if people make heavy use of function calls rather than templates.

Generally I think that if there is a speed difference between processors (for example MSXML4's XSLT processor is often reported to be three or four times faster than the XSLT processor in .NET) then the difference seems to apply fairly uniformly across a wide range of stylesheets. This suggests to me that the speed difference is not primarily due to the smartness of the optimizer, which would give very variable ratios for different stylesheets. Rather, it has a lot to do with the general tuning of the code and in particular the efficiency of memory management. For many single-document transformations the performance is dominated by source document parsing, tree construction, sorting, and serialization (and stylesheet compilation if you count that in), and no amount of optimization is going to make a big impact on that.

Michael Kay

_____

From: Daniela Florescu [mailto:http://xquery.com/mailman/listinfo/talk]
Sent: 28 April 2004 18:16
To: Michael Kay
Cc: 'Edward Gillespie'; http://xquery.com/mailman/listinfo/talk
Subject: Re: General comparisons of speed of xquery vs. xslt

(b) with in-memory transformations there is no intrinsic reason why XQuery should be faster than XSLT

Michael,

I think I disagree with this statement. This might be due to the fact that I understand XQuery much better than I understand XSLT, but here is my rationale anyway.
Most XQuery code rewriting rules that we apply in the BEA implementation require serious dataflow analysis (i.e. how is the data flowing through expressions, where is the data coming from and where is it going), similar in spirit to what all modern compilers do. Trivial examples of code rewriting rules that require dataflow analysis are eliminating unnecessary sorts and duplicate elimination, transforming backwards navigation into forward navigation, introducing parallelism and asynchronicity, etc., but there are many, many others.

Moreover, we are building a streaming XQuery engine. Of course, not all queries can be executed in a purely streaming fashion. We use the same dataflow analysis to detect and minimize the need for materialization, which is essential for query performance.

Now it seems to me that this dataflow analysis is easier to do in XQuery (through expressions) than in XSLT (through templates). Knowing XSLT much better than I do, what is your take on this? Are there any XSLT implementations that do dataflow analysis for optimization?

Best regards,
Dana

P.S. A while ago we wrote a paper describing our streaming XQuery implementation.

Daniela Florescu, Chris Hillery, Donald Kossmann, Paul Lucas, Fabio Riccardi, Till Westmann, Michael J. Carey, Arvind Sundararajan, Geetika Agrawal: The BEA/XQRL Streaming XQuery Processor. VLDB 2003: 997-1008
http://www-dbs.informatik.uni-heidelberg.de/publications/index.shtml

A better version will appear in VLDB Journal soon.

And by the way, in this paper we did compare our XQuery implementation with an XSLT implementation. While doing so, we did translate XMark in XQuery. If there is some demand, we can spend some time, polish those queries and publish them in an open forum somewhere.
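
Both messages in this thread mention detecting statically when a path expression needs no sort. As a rough illustration only, here is a minimal Python sketch of one conservative way such a check could work. It is not taken from Saxon, the BEA engine, or any other processor named above. The names `Props`, `apply_step` and `steps_needing_sort` are hypothetical, and the sketch assumes the path starts from a single context node and looks only at the axis of each step (node tests and predicates are ignored, since they can only remove nodes).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Props:
    """Static knowledge about the node sequence produced so far."""
    ordered: bool  # known to be in document order with no duplicates
    peers: bool    # known that no node is an ancestor of another


def apply_step(props: Props, axis: str) -> Props:
    """Properties of the raw (unsorted) result of applying one axis step."""
    if axis == "self":
        return props  # same nodes, same properties
    if props.ordered and props.peers:
        # Subtrees of peer nodes are disjoint and arrive in document order.
        if axis in ("child", "attribute"):
            return Props(ordered=True, peers=True)
        if axis in ("descendant", "descendant-or-self"):
            return Props(ordered=True, peers=False)
    # Conservative fallback: reverse axes, or any step applied to nodes
    # that may be nested inside one another, may need a sort.
    return Props(ordered=False, peers=False)


def steps_needing_sort(axes: list[str]) -> list[int]:
    """Return the indexes of the steps whose result must be sorted."""
    props = Props(ordered=True, peers=True)  # a single context node
    needs = []
    for i, axis in enumerate(axes):
        props = apply_step(props, axis)
        if not props.ordered:
            needs.append(i)
            # After an explicit sort the result is ordered, but the
            # nodes may still be nested inside one another.
            props = Props(ordered=True, peers=False)
    return needs
```

Under these assumptions a path of child steps such as a/b/c is reported as needing no sort at all, while a child step that follows a descendant step (as in //a/b) is flagged. The second case matters when `a` elements are nested, because the children of an outer `a` can follow the children of an inner one in document order. A real optimizer would track much more, for example type information and cardinality.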