[xsl] Re: Statistics - Calculating Standard Deviation

Subject: Re: Statistics - Calculating Standard Deviation
From: "Dimitre Novatchev" <dnovatchev@xxxxxxxxx>
Date: Fri, 13 Jun 2003 19:17:33 +0200

"Andrew Welch" <AWelch@xxxxxxxxxxxxxxx> wrote in message
news:3BAAB77DB787FC4C961601D815DAF1E50E6C41@xxxxxxxxxxxxxxxxxxxxxxxx
> > The performance is the thing that is worrying me most.  Ideally the
> > target processor is MSXML 4.0, but that is open to negotiation...
>
> Well using saxon 7.x (use the latest) and exslt/math you could use the
> following simple stylesheet.  I'm just wondering how much of this can be
> done using straight xslt 2 now... Is there a square root function? I had a
> quick look but didn't see anything.
>

The solution I posted earlier today runs OK without any modifications in
XSLT 2.0 (Saxon 7.5):

http://aspn.activestate.com/ASPN/Mail/Message/XSL-List/1670297

>
>
>
> <?xml version="1.0"?>
> <xsl:stylesheet version="1.0"
>   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>   xmlns:exsl="http://exslt.org/math">
>
> <xsl:variable name="mean" select="sum(/root/node) div count(/root/node)"/>
>
> <xsl:variable name="diffs">
>   <root>
>     <xsl:for-each select="/root/node">
>       <node squaredDiff="{exsl:power($mean - .,2)}">


Why is exsl:power necessary here? Multiplying the number by itself in pure
XSLT would probably be no slower.


>          <xsl:copy-of select="."/>
>       </node>
>     </xsl:for-each>
>   </root>
> </xsl:variable>
>
> <xsl:variable name="mean.Of.Sum.Of.Diffs">
>   <xsl:for-each select="$diffs">
>     <xsl:value-of select="sum(/root/node/@squaredDiff) div (count(/root/node)-1)"/>
>   </xsl:for-each>
> </xsl:variable>
>
> <xsl:template match="/">
>   standard deviation: <xsl:value-of select="exsl:sqrt(number($mean.Of.Sum.Of.Diffs))"/>
> </xsl:template>
>
> </xsl:stylesheet>
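
The plain multiplication suggested in the comment on exsl:power could look roughly like the fragment below. It is only a sketch of the "diffs" variable from the quoted stylesheet with the extension call replaced. It assumes the same /root/node input and the same $mean variable, and is not a complete stylesheet on its own.

```xml
<!-- Sketch: the quoted "diffs" variable, without the exsl:power extension.
     Assumes $mean is defined as in the quoted stylesheet. -->
<xsl:variable name="diffs">
  <root>
    <xsl:for-each select="/root/node">
      <!-- ($mean - .) * ($mean - .) is the squared difference in pure XPath 1.0 -->
      <node squaredDiff="{($mean - .) * ($mean - .)}">
        <xsl:copy-of select="."/>
      </node>
    </xsl:for-each>
  </root>
</xsl:variable>
```

With this change the only remaining extension function in the quoted stylesheet is exsl:sqrt.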


The quoted solution will use 2 * N units of memory, which may limit its
applicability, especially when processing long node-sets.
It may also require from three to five traversals of a node-set of length N,
the length of the initial node-set (the exact number depends on the
processor; sum() and count() may each need a traversal of their own).
An advantage (in efficiency) is that it does not require any recursion.

However, I guess it would be much more efficient if sequences were
used/built instead of node-sets.
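
As an illustration of the sequence-based idea, here is a minimal XSLT 2.0 sketch. It is an assumption about how this might be written, not a tested or measured solution: it assumes the same /root/node input as the quoted stylesheet, and the variable names are invented for the example. It stops at the sample variance and leaves the square root to whatever function the processor offers (for example the EXSLT sqrt used in the quoted stylesheet).

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Sequence of numbers rather than a temporary tree of nodes -->
    <xsl:variable name="values"
      select="for $x in /root/node return number($x)"/>
    <xsl:variable name="n" select="count($values)"/>
    <xsl:variable name="mean" select="sum($values) div $n"/>

    <!-- Sum of squared differences, divided by (n - 1) -->
    <xsl:variable name="variance"
      select="sum(for $v in $values
                  return ($v - $mean) * ($v - $mean)) div ($n - 1)"/>

    <xsl:text>sample variance: </xsl:text>
    <xsl:value-of select="$variance"/>
  </xsl:template>

</xsl:stylesheet>
```

No intermediate tree is built here, so the squared differences exist only as a sequence of numbers. Whether this is actually faster will depend on the processor.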


=====
Cheers,

Dimitre Novatchev.
http://fxsl.sourceforge.net/ -- the home of FXSL




 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
