Re: Speeding up processing (with sablotron or saxon)

Subject: Re: Speeding up processing (with sablotron or saxon)
From: "TDarksword" <tdarksword@xxxxxxxxxxxx>
Date: Tue, 13 Jul 2004 15:57:44 +0100
----- Original Message ----- 
From: "Wendell Piez" <wapiez@xxxxxxxxxxxxxxxx>
To: <xsl-list@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Tuesday, July 13, 2004 12:03 AM
Subject: Re:  Speeding up processing (with sablotron or saxon)


> Hi,
>
> At 01:33 PM 7/12/2004, you wrote:
> >ok I have a piece of XSLT that processes a large XML file into smaller
> >chunks. The problem I have is that the deeper down into the XML file I am
> >processing the longer it takes. Is this just due to the way XSLT parsers
> >work or can I tweak my XSL file so it processes faster?
> >
> >I get the same effect whether I process the file in one pass using
> >Saxon Result:document or as separate XSL files with either
> >Saxon or Sablotron.
> >
> >
> >This is the separate XSL file (change the server[@name='Ahazi'] as
> >needed):
> ><?xml version="1.0"?>
> ><xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> >version="1.0">
> ><xsl:output method="xml" indent='yes' encoding="utf-8"/>
> >
> ><xsl:template match="server" />
> ><xsl:template match="server[@name='Ahazi']">
> ><resources>
> ><xsl:for-each
> >select=".//resource[not(@swgcraft_id=preceding::*/@swgcraft_id)]">
>
> ... this for-each is expensive. You are traversing the entire document
> looking for 'resource' elements; each one you find is examined by looking
> at all its preceding elements and comparing their @swgcraft_id attributes.
> When you have lots of elements, lots and lots of them are compared. (n^2
> performance.)
>
> Since this happens every time the template is matched (which could itself
> be lots of times), it adds up -- especially for the later nodes in your
> set (as you noticed).
>
> An easy tweak to improve performance would be to use keys to de-duplicate
> instead of doing it by hand on the preceding:: axis.
>
> So:
>
> <xsl:key name="resource-by-id" match="resource" use="@swgcraft_id"/>
>
> <xsl:variable name="resources" select="//resource"/>
> (binding //resource to a variable $resources so we don't have to retrieve
> it every single time)
>
> then you can deduplicate in another variable declaration:
>
> <xsl:variable name="unique-resources"
>     select="$resources[count(.|key('resource-by-id',@swgcraft_id)[1]) = 1]"/>
>
> In English: $unique-resources is the collection of all resources which,
> when counted along with the first resource with the same swgcraft_id as
> themselves, amount to a single node (which is true only of the first one
> with each swgcraft_id).
>
> This ought to help quite a bit.
>
> Cheers,
> Wendell
>

So I'd replace the:-
<xsl:for-each
select=".//resource[not(@swgcraft_id=preceding::*/@swgcraft_id)]">

with

<xsl:key name="resource-by-id" match="resource" use="@swgcraft_id"/>
<xsl:variable name="resources" select="//resource"/>
<xsl:variable name="unique-resources"
     select="$resources[count(.|key('resource-by-id',@swgcraft_id)[1]) = 1]"/>

but I guess I still need some form of for-each statement too?
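
Presumably something like the sketch below, which I have not run against the real data. It assumes the xsl:key declaration moves to the top level of the stylesheet (xsl:key is only allowed there) and that the first-in-group test goes straight into the for-each select, so only the first resource for each swgcraft_id is kept. The xsl:copy-of body is just a placeholder, since the real loop body isn't shown above.

```xml
<?xml version="1.0"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes" encoding="utf-8"/>

  <!-- xsl:key must be a top-level element; it indexes every resource
       in the document by its swgcraft_id attribute -->
  <xsl:key name="resource-by-id" match="resource" use="@swgcraft_id"/>

  <!-- ignore every server except the one named below -->
  <xsl:template match="server"/>

  <xsl:template match="server[@name='Ahazi']">
    <resources>
      <!-- keep a resource only when it is the same node as the first
           resource the key returns for its swgcraft_id -->
      <xsl:for-each
          select=".//resource[count(. | key('resource-by-id', @swgcraft_id)[1]) = 1]">
        <!-- placeholder body (assumption): the original loop body is not shown -->
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </resources>
  </xsl:template>
</xsl:transform>
```

Because the key indexes the whole document, "first" here means first in the entire file, which is also what the preceding:: test did. If uniqueness is wanted per server instead, I assume the key's use expression would have to include the server name as well.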

TIA Tony
