Subject: Re: XSL performance question: running count of attributes using axes and sum()
From: mark bordelon <markcbordelon@xxxxxxxxx>
Date: Thu, 9 Apr 2009 13:46:24 -0700 (PDT)
Ken,

Since your solution appeared to be the least invasive, I tried your improved
axis right away and noticed an increase in performance by a factor of eight. I
had to make use of your second variant, since syl nodes can be children of
line as well as children of word.

<xsl:template match="syl">

<xsl:variable name="current_quantity"><xsl:value-of select="sum( (
preceding-sibling::syl | ../preceding-sibling::syl |
preceding-sibling::word/syl | ../preceding-sibling::word/syl ) [ not( @elide =
'true' ) ]/ floor(@length) )" /></xsl:variable>

<xsl:variable name="color"><xsl:choose><xsl:when test="@length=2 and
($current_quantity mod 4 =
0)">background-color:#EEFFEE;</xsl:when></xsl:choose></xsl:variable>
....

Your explanation of the floor() function syntax problem was particularly
helpful and made a 1.0 solution even more urgent for me. I don't suppose we
could continue this discussion from there? I REALLY would like to make this a
1.0 transform.

I will try the solutions of the other responders next.

Thanks again.

Mark

--- On Thu, 4/9/09, G. Ken Holman <gkholman@xxxxxxxxxxxxxxxxxxxx> wrote:


From: G. Ken Holman <gkholman@xxxxxxxxxxxxxxxxxxxx>
Subject: Re:  XSL performance question: running count of attributes using
axes and sum()
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Date: Thursday, April 9, 2009, 12:32 PM


(retrying send due to problems; apologies if this is a duplicate)

At 2009-04-09 11:22 -0700, mark bordelon wrote:
> In transforming the <syl> tags below into HTML table cells to display them,
> I need to format each cell with a green color when the running total of the
> @length attributes is a multiple of four. Ideally having the ability to do
> running totals in another variable would be great, but not the best XSL-esque
> solution, so I am using axes instead.

But I think your axis approach could be improved.

> I have tried solutions with count and sum, but performance is slow: 756
> lines like the ones below mean thousands of syllables to check, each with its
> own axis computation -- the complete xform takes more than an hour!
>
> Can anyone point me to a solution that is more performant yet still
> elegant/simple?
>
> An aside: it seems that ceiling() is an Xpath1.0 function, but oddly enough
> not floor()

No, you are writing an XPath 2.0 expression that violates XPath 1.0 syntax
.... floor() is supported in XPath 1.0:

  http://www.w3.org/TR/1999/REC-xpath-19991116#function-floor

You have written:

   node-set-expression/floor(attribute)

which is acceptable in XPath 2.0 but not in XPath 1.0 where you need to
write:

   floor(node-set-expression/attribute)
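
As a minimal sketch of the XPath 1.0 form (a stand-alone stylesheet, not one from this thread), the function call wraps the path rather than following a slash. Note that in XPath 1.0, floor() given a node set with several nodes converts only the first node in document order, so the call belongs where the context is already a single syl:

```xml
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="text"/>

<xsl:template match="syl">
  <!-- XPath 1.0: floor() wraps the path; here the path is one attribute -->
  <xsl:value-of select="."/>
  <xsl:text> FLOOR: </xsl:text>
  <xsl:value-of select="floor(@length)"/>
  <xsl:text>&#10;</xsl:text>
</xsl:template>

<!-- suppress stray text between elements -->
<xsl:template match="text()"/>

</xsl:stylesheet>
```

This handles one value at a time; it does not by itself give a sum of floored values.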

>  -- Altova SPY complains about floor until I change the stylesheet to
> version 2.0 (sigh).

Yes, because of your expression, not because of the function.

Now, you are using that inside of a sum() expression ... which means you
cannot use XPath 1.0 because you are applying the floor() function to each
argument within the sum().

> I would love this to transform in XSL1.0 if possible, and rounding down each
> length to the integer is essential to achieve the correct formatting result.

Then it is going to be an elaborate solution because the argument to sum() in
XPath 1.0 can only be a node set (a path expression or a variable holding
one), not the result of applying a function to each member of the node set.

I'll continue with XPath 2.0.

> Thanks in advance for any help on this.

The preceding axis has syllables all the way to the start of the document ....
I suspect you need only work with siblings to get the performance you need.

> XML:
>
> <poem>
> ...
> </poem>
>
> XSL template:
>
> <xsl:template match="syl">
>
> <xsl:variable name="line_id"><xsl:value-of
> select="node()/ancestor::line/@id" /></xsl:variable>

It also wastes time to assign variables and use them, though the processor may
optimize this.

> <xsl:variable name="current_quantity">

Assigning text values to temporary trees and converting them later to numbers
is very (very!) inefficient.  You should just assign the value to the
variable.

> <xsl:value-of select="sum(preceding::syl[ancestor::line/@id = $line_id and
> (not(@elide) or @elide='false') ]/floor(@length))" /></xsl:variable>

<xsl:variable name="current_quantity" select="expression"/>
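
To make the select form concrete, here is a small stand-alone sketch (not from this thread) that binds values directly instead of building temporary trees. The variable name length is made up for illustration; line_id is the name used in the posted template.

```xml
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="text"/>

<xsl:template match="syl">
  <!-- bound directly: $line_id is the attribute node, no temporary tree -->
  <xsl:variable name="line_id" select="ancestor::line/@id"/>
  <!-- bound directly: $length (hypothetical name) is already a number -->
  <xsl:variable name="length" select="floor(@length)"/>

  <xsl:text>LINE: </xsl:text><xsl:value-of select="$line_id"/>
  <xsl:text> LENGTH: </xsl:text><xsl:value-of select="$length"/>
  <xsl:text>&#10;</xsl:text>
</xsl:template>

<!-- suppress stray text between elements -->
<xsl:template match="text()"/>

</xsl:stylesheet>
```

With select= the processor keeps the node or number as is; with xsl:value-of in the variable body it has to build a tree holding text and convert that text back whenever a number is needed.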

The preceding axis is notoriously slow because you will be going to *all*
syllables all the way to the start of the document.

Did you mean to have the <syl> for "que" between words as in your first line? 
That looks out of place and it really adds a lot to the expressions.

If not:

sum( ( preceding-sibling::syl | ../preceding-sibling::word/syl )
     [ not( @elide = 'true' ) ]/
     floor(@length) )

If so:

sum( ( preceding-sibling::syl | ../preceding-sibling::syl |
       preceding-sibling::word/syl | ../preceding-sibling::word/syl )
     [ not( @elide = 'true' ) ]/
     floor(@length) )

Note that I've replaced " not(@elide) or @elide='false' " with "not(
@elide='true' )" because @elide='true' will be false if there is no @elide
attribute, so not(@elide='true') will be true if there is no @elide attribute
or if @elide='false'.  I am assuming there are no other values for @elide.

Since I'm using XSLT 2.0, that could also be done as:

sum( ( ancestor::line//syl[. << current()] )
     [ not( @elide = 'true' ) ]/
     floor(@length) )

.... but I can't comment on performance and would be interested to hear what
you experience with your large data set.

> <xsl:variable name="color"><xsl:choose><xsl:when test="@length=2 and
> ($current_quantity mod 4 =
> 0)">background-color:#EEFFEE;</xsl:when></xsl:choose></xsl:variable>
>
> <td style="{$color}"><xsl:value-of select="text()" /></td>

I think it is a bad habit to address text nodes explicitly, and I think you
should be using "." instead of "text()".  Not everyone feels that way.

> </xsl:template>

You didn't post a working stylesheet, so it took time to rewrite your code for
illustration.  I've run the rewritten one below and then suggested what I
think would be faster performing.

I hope this helps.  It would take too long for me to volunteer the recursive
loop that applies floor() to each value before summing, which is what an
XSLT 1.0 version would need.
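
For readers who want the XSLT 1.0 route, one possible shape of that recursion is sketched here. It is not code from this thread: the named template sum-floor and its parameters nodes and total are hypothetical, and the sketch assumes every counted syl carries a numeric length attribute (a missing one would turn the total into NaN). The node selection is the sibling-based expression from above, with floor() moved out of the path and into the recursive step.

```xml
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="text"/>

<!-- hypothetical helper: adds floor(@length) of each node in $nodes
     to $total, one node per recursive call -->
<xsl:template name="sum-floor">
  <xsl:param name="nodes" select="/.."/>  <!-- default: empty node set -->
  <xsl:param name="total" select="0"/>
  <xsl:choose>
    <xsl:when test="$nodes">
      <xsl:call-template name="sum-floor">
        <xsl:with-param name="nodes" select="$nodes[position() > 1]"/>
        <xsl:with-param name="total"
                        select="$total + floor($nodes[1]/@length)"/>
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$total"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

<xsl:template match="syl">
  <!-- the result of a named template can only be captured in the
       variable body, so this one is necessarily a temporary tree -->
  <xsl:variable name="current_quantity">
    <xsl:call-template name="sum-floor">
      <xsl:with-param name="nodes"
        select="( preceding-sibling::syl | ../preceding-sibling::syl |
                  preceding-sibling::word/syl |
                  ../preceding-sibling::word/syl )
                [ not( @elide = 'true' ) ]"/>
    </xsl:call-template>
  </xsl:variable>

  <xsl:variable name="color">
    <xsl:if test="@length = 2 and $current_quantity mod 4 = 0">GREEN</xsl:if>
  </xsl:variable>

  <xsl:text>SYL: </xsl:text><xsl:value-of select="."/>
  <xsl:text> SUM: </xsl:text><xsl:value-of select="$current_quantity"/>
  <xsl:text> COLOR: </xsl:text><xsl:value-of select="$color"/>
  <xsl:text>&#10;</xsl:text>
</xsl:template>

<!-- suppress stray text between elements -->
<xsl:template match="text()"/>

</xsl:stylesheet>
```

The recursion is only as deep as the number of counted syllables earlier in the same line, so depth should stay small for verse. Whether this beats the 2.0 expression on a large file would need measuring.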

.. . . . . . . . . . . Ken


t:\ftemp>type bordelon.xml
<poem>
        <line id="1">
                <word id="1">
                        <syl length="2">Ar</syl>
                        <syl length="1">ma</syl>
                </word>
                <word id="2">
                        <syl length="1">vi</syl>
                        <syl length="2">rum</syl>
                </word>
                <syl length="1">que</syl>
                <word id="3">
                        <syl length="1">ca</syl>
                        <syl length="2">no</syl>
                </word> ,
                <word id="4">
                        <syl length="2">Tro</syl>
                        <syl length="2">iae</syl>
                </word>
                <word id="5">
                        <syl length="2">qui</syl>
                </word>
                <word id="6">
                        <syl length="2">pri</syl>
                        <syl length="1">mus</syl>
                </word>
                <word id="7">
                        <syl length="1">ab</syl>
                </word>
                <word id="8">
                        <syl length="2">o</syl>
                        <syl length="2">ris</syl>
                </word>
        </line>
        <line id="2">
                <word>
                        <syl length="2">li</syl>
                        <syl length="1.5">to</syl>
                        <syl length="1">ra</syl>
                </word> ,
                <word id="15">
                        <syl length="2">mul</syl>
                        <syl elide="true" length="1">tum</syl>
                </word>
                <word id="16">
                        <syl length="2">il</syl>
                        <syl elide="true" length="1">le</syl>
                </word>
                <word id="17">
                        <syl length="2">et</syl>
                </word>
                <word id="18">
                        <syl length="2">ter</syl>
                        <syl length="2">ris</syl>
                </word>
                <word id="19">
                        <syl length="2">iac</syl>
                        <syl length="2">ta</syl>
                        <syl length="1">tus</syl>
                </word>
                <word id="20">
                        <syl length="1">et</syl>
                </word>
                <word id="21">
                        <syl length="2">al</syl>
                        <syl length="2">to</syl>
                </word>
        </line>
</poem>


t:\ftemp>call xslt2 bordelon.xml bordelon.xsl

SYL: Ar SUM: 0 COLOR: GREEN
SYL: ma SUM: 2 COLOR:
SYL: vi SUM: 3 COLOR:
SYL: rum SUM: 4 COLOR: GREEN
SYL: que SUM: 6 COLOR:
SYL: ca SUM: 7 COLOR:
SYL: no SUM: 8 COLOR: GREEN
SYL: Tro SUM: 10 COLOR:
SYL: iae SUM: 12 COLOR: GREEN
SYL: qui SUM: 14 COLOR:
SYL: pri SUM: 16 COLOR: GREEN
SYL: mus SUM: 18 COLOR:
SYL: ab SUM: 19 COLOR:
SYL: o SUM: 20 COLOR: GREEN
SYL: ris SUM: 22 COLOR:
SYL: li SUM: 0 COLOR: GREEN
SYL: to SUM: 2 COLOR:
SYL: ra SUM: 3 COLOR:
SYL: mul SUM: 4 COLOR: GREEN
SYL: tum SUM: 6 COLOR:
SYL: il SUM: 6 COLOR:
SYL: le SUM: 8 COLOR:
SYL: et SUM: 8 COLOR: GREEN
SYL: ter SUM: 10 COLOR:
SYL: ris SUM: 12 COLOR: GREEN
SYL: iac SUM: 14 COLOR:
SYL: ta SUM: 16 COLOR: GREEN
SYL: tus SUM: 18 COLOR:
SYL: et SUM: 19 COLOR:
SYL: al SUM: 20 COLOR: GREEN
SYL: to SUM: 22 COLOR:
t:\ftemp>call xslt2 bordelon.xml bordelon-new.xsl

SYL: Ar SUM: 0 COLOR: GREEN
SYL: ma SUM: 2 COLOR:
SYL: vi SUM: 3 COLOR:
SYL: rum SUM: 4 COLOR: GREEN
SYL: que SUM: 6 COLOR:
SYL: ca SUM: 7 COLOR:
SYL: no SUM: 8 COLOR: GREEN
SYL: Tro SUM: 10 COLOR:
SYL: iae SUM: 12 COLOR: GREEN
SYL: qui SUM: 14 COLOR:
SYL: pri SUM: 16 COLOR: GREEN
SYL: mus SUM: 18 COLOR:
SYL: ab SUM: 19 COLOR:
SYL: o SUM: 20 COLOR: GREEN
SYL: ris SUM: 22 COLOR:
SYL: li SUM: 0 COLOR: GREEN
SYL: to SUM: 2 COLOR:
SYL: ra SUM: 3 COLOR:
SYL: mul SUM: 4 COLOR: GREEN
SYL: tum SUM: 6 COLOR:
SYL: il SUM: 6 COLOR:
SYL: le SUM: 8 COLOR:
SYL: et SUM: 8 COLOR: GREEN
SYL: ter SUM: 10 COLOR:
SYL: ris SUM: 12 COLOR: GREEN
SYL: iac SUM: 14 COLOR:
SYL: ta SUM: 16 COLOR: GREEN
SYL: tus SUM: 18 COLOR:
SYL: et SUM: 19 COLOR:
SYL: al SUM: 20 COLOR: GREEN
SYL: to SUM: 22 COLOR:
t:\ftemp>type bordelon.xsl
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

<xsl:output method="text"/>

<xsl:template match="syl">
<xsl:variable name="line_id"><xsl:value-of select="node()/ancestor::line/@id"
/></xsl:variable>

<xsl:variable name="current_quantity"><xsl:value-of
select="sum(preceding::syl[ancestor::line/@id = $line_id and (not(@elide) or
@elide='false') ]/floor(@length))" /></xsl:variable>

<xsl:variable name="color"><xsl:choose><xsl:when test="@length=2 and
($current_quantity mod 4 = 0)">GREEN</xsl:when></xsl:choose></xsl:variable>

<xsl:text/>
SYL: <xsl:value-of select="."/> SUM: <xsl:value-of
select="$current_quantity"/> COLOR: <xsl:value-of select="$color"/>

</xsl:template>

<xsl:template match="text()"/>

</xsl:stylesheet>

t:\ftemp>type bordelon-new.xsl
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

<xsl:output method="text"/>

<xsl:template match="syl">

<xsl:variable name="current_quantity"
              select="sum( ( preceding-sibling::syl |
                             ../preceding-sibling::syl |
                             preceding-sibling::word/syl |
                             ../preceding-sibling::word/syl )
                           [ not( @elide = 'true' ) ]/
                           floor(@length) )"/>

<xsl:variable name="color"
              select="if( @length=2 and
                          $current_quantity mod 4 = 0 ) then 'GREEN' else
''"/>

<xsl:text/>
SYL: <xsl:value-of select="."/> SUM: <xsl:value-of
select="$current_quantity"/> COLOR: <xsl:value-of select="$color"/>

</xsl:template>

<xsl:template match="text()"/>

</xsl:stylesheet>

t:\ftemp>type bordelon-new2.xsl
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

<xsl:output method="text"/>

<xsl:template match="syl">

<xsl:variable name="current_quantity"
              select="sum( ( ancestor::line//syl[. &lt;&lt; current()]  )
                           [ not( @elide = 'true' ) ]/
                           floor(@length) )"/>

<xsl:variable name="color"
              select="if( @length=2 and
                          $current_quantity mod 4 = 0 ) then 'GREEN' else
''"/>

<xsl:text/>
SYL: <xsl:value-of select="."/> SUM: <xsl:value-of
select="$current_quantity"/> COLOR: <xsl:value-of select="$color"/>

</xsl:template>

<xsl:template match="text()"/>

</xsl:stylesheet>

t:\ftemp>rem Done!


--
XSLT/XSL-FO/XQuery training in Los Angeles (New dates!) 2009-06-08
Training tools: Comprehensive interactive XSLT/XPath 1.0/2.0 video
Video lesson:    http://www.youtube.com/watch?v=PrNjJCh7Ppg&fmt=18
Video overview:  http://www.youtube.com/watch?v=VTiodiij6gE&fmt=18
G. Ken Holman                 mailto:gkholman@xxxxxxxxxxxxxxxxxxxx
Crane Softwrights Ltd.          http://www.CraneSoftwrights.com/s/
Male Cancer Awareness Nov'07  http://www.CraneSoftwrights.com/s/bc
Legal business disclaimers:  http://www.CraneSoftwrights.com/legal
