Subject: RE: use-when attribute?
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Sun, 19 Dec 2004 22:52:23 -0000
> One stylesheet defines a global variable select="//a|//b|//c"
> and another similarly a key (or three with the same name)
> with match="a|b|c" use="'all'".
>
> The root template outputs a root node containing the count of
> either the global variable (test case 1) or of the key
> (test case 2).

Consider count(//a|//b|//c).

A completely naive implementation will build three node-sets in memory, the
result of //a, //b, and //c respectively, will then combine these into a
single node-set in memory, sorting and eliminating duplicates as it goes,
and will then count the number of nodes in the combined node-set.
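
As a rough illustration of that naive strategy, here is a Python sketch using
the standard xml.etree.ElementTree module rather than any real XSLT processor.
The sample document and the name naive_count_union are made up for the
example, and the position table is only a stand-in for whatever
document-order comparison a processor has built in.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: two <a>, one <b>, two <c>, one <d>.
DOC = "<r><a/><b><a/><c/></b><c/><d/></r>"

def naive_count_union(root, names):
    # Document-order position of every element, needed for the sort below.
    position = {id(el): i for i, el in enumerate(root.iter())}
    # Step 1: materialize one complete node list per path (//a, //b, //c).
    node_sets = [list(root.iter(name)) for name in names]
    # Step 2: combine them, eliminating duplicates by node identity...
    combined = {id(el): el for node_set in node_sets for el in node_set}
    # ...and sort the result into document order.
    ordered = sorted(combined.values(), key=lambda el: position[id(el)])
    # Step 3: only now count the nodes.
    return len(ordered)

root = ET.fromstring(DOC)
print(naive_count_union(root, ["a", "b", "c"]))  # 5
```

Every intermediate list here exists only to be thrown away once the count is
known, which is what the optimizations try to avoid.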

But there are many fairly easy optimizations possible:

(a) The expression //a|//b|//c can be rewritten as //*[self::a or self::b or
self::c]. This then performs a single scan of the document, with no need to
eliminate duplicates.

(b) count() doesn't need to materialize the node-set in memory: it can count
the nodes as they are found (this is called pipelining).

(c) The system can recognize that expressions such as //a deliver results in
sorted order without doing an explicit sort, and the union operation (|)
between such node-sets can be done by a merge operation without any need for
a sort.

(d) A union operation can be pipelined: there is no need to materialize //a
and //b in memory in order to form their union.

(e) The system can recognize that //a and //b are disjoint (an element cannot
be named both a and b), so count(//a|//b) is the same as
count(//a) + count(//b).
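
Points (c), (d) and (e) can be sketched in Python, again with
xml.etree.ElementTree standing in for a processor's tree model. The names
merged_union and count_disjoint are invented for the example, and the
position table is an assumption that replaces a real processor's
document-order comparison. Each root.iter(name) stream is lazy and already in
document order, so heapq.merge can combine the streams without sorting or
materializing them.

```python
import heapq
import xml.etree.ElementTree as ET

# Hypothetical sample document: two <a>, one <b>, two <c>, one <d>.
DOC = "<r><a/><b><a/><c/></b><c/><d/></r>"

def merged_union(root, names):
    # Stand-in for a processor's built-in document-order comparison.
    position = {id(el): i for i, el in enumerate(root.iter())}
    # Each stream is lazy and already delivers nodes in document order.
    streams = [root.iter(name) for name in names]
    previous = None
    # Merging sorted streams keeps the result sorted with no explicit sort.
    for el in heapq.merge(*streams, key=lambda el: position[id(el)]):
        if el is not previous:  # after a merge, duplicates are adjacent
            previous = el
            yield el

def count_disjoint(root, names):
    # Point (e): distinct element names select disjoint node-sets, so the
    # counts can simply be added. Assumes the names in `names` are distinct.
    return sum(sum(1 for _ in root.iter(name)) for name in names)

root = ET.fromstring(DOC)
print([el.tag for el in merged_union(root, ["a", "b", "c"])])
print(count_disjoint(root, ["a", "b", "c"]))
```

Neither function builds a list of results: the union is produced one node at
a time, and the disjoint count needs no union at all.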

In short, it isn't hard for a system to evaluate this expression in a single
scan, or perhaps three scans, and there is no need to allocate any memory
for temporary results.
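
A minimal Python sketch of the single-scan case, combining the rewrite in (a)
with the pipelined count in (b). It uses xml.etree.ElementTree as a stand-in
for a real processor, and count_single_scan is a made-up name.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: two <a>, one <b>, two <c>, one <d>.
DOC = "<r><a/><b><a/><c/></b><c/><d/></r>"

def count_single_scan(root, names):
    wanted = frozenset(names)
    count = 0
    for el in root.iter():      # one traversal of the whole tree
        if el.tag in wanted:    # the self::a or self::b or self::c test
            count += 1          # counted as it is found, never stored
    return count

root = ET.fromstring(DOC)
print(count_single_scan(root, ["a", "b", "c"]))  # 5
```

The only state kept between nodes is the running count.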

Even if the result of //a|//b|//c is assigned to a variable and you then do
count() on the variable, the system may be able to tell that it doesn't need
to allocate memory to hold the value (it might be able to see that it's only
used once).

Now consider the solution using keys. If you define a key using xsl:key, and
then use it in the key() function, you're declaring a fairly clear intent
that you want to build some kind of index to make repeated calls on key()
faster. Building an index is an expensive operation: it's only worth doing
if you are going to use it repeatedly. If you only use it once, it will
almost certainly be slower than doing the search directly. It's unlikely
that the system will notice that it's not worth building the index, because
xsl:key is intended as an explicit performance hint and the system would
usually assume that if the user says they want an index, they mean it.
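
To make the cost difference concrete, here is a hedged Python sketch of what
building such an index involves compared with a direct search. It is not how
any particular processor implements xsl:key; build_key_index, count_direct
and the sample document are all invented for the illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample document: two <a>, one <b>, two <c>, one <d>.
DOC = "<r><a/><b><a/><c/></b><c/><d/></r>"

def build_key_index(root, match_names, use):
    # Visits every element once, just like the direct search, but also
    # allocates a table and stores every matching node in it.
    wanted = frozenset(match_names)
    index = defaultdict(list)
    for el in root.iter():
        if el.tag in wanted:
            index[use(el)].append(el)
    return index

def count_direct(root, names):
    # The direct search: one scan, nothing stored.
    wanted = frozenset(names)
    return sum(1 for el in root.iter() if el.tag in wanted)

root = ET.fromstring(DOC)
# Mirrors match="a|b|c" use="'all'" from the quoted stylesheet.
index = build_key_index(root, ["a", "b", "c"], use=lambda el: "all")
print(len(index["all"]))                      # 5
print(count_direct(root, ["a", "b", "c"]))    # 5
print(list(index))                            # ['all']
```

Both routes scan the document once, but the index route also pays for the
table and the stored node lists. That extra work is only repaid if the index
is looked up many times, and with a constant use value it never holds more
than one entry.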
>
> Despite the explanation from Mike, I still don't _exactly_
> understand why the key solution is so much slower in this case
> (it doesn't fit with other experiences with keys in which I
> reached a large gain by using them), but it is at least very
> obvious that it is _not_ wise to use a key with a fixed use
> pattern...
>

The main reason the key is slower in this case is that you are only using it
once. But there's no good reason for ever using a key with a fixed use
expression more than once, because you could always put the result of the
first evaluation in a variable. So yes, I can't see any reason for using
this construct.

Michael Kay
http://www.saxonica.com/
