
Re: Statistics - Calculating Standard Deviation

Subject: Re: Statistics - Calculating Standard Deviation
From: "Dimitre Novatchev" <dnovatchev@xxxxxxxxx>
Date: Fri, 13 Jun 2003 19:17:33 +0200
"Andrew Welch" <AWelch@xxxxxxxxxxxxxxx> wrote in message
news:3BAAB77DB787FC4C961601D815DAF1E50E6C41@xxxxxxxxxxxxxxxxxxxxxxxx
> > The performance is the thing that is worrying me most.  Ideally the
> > target processor is MSXML 4.0, but that is open to negotiation...
>
> Well using saxon 7.x (use the latest) and exslt/math you could use the
> following simple stylesheet.  I'm just wondering how much of this can be
> done using straight xslt 2 now... Is there a square root function? I had a
> quick look but didn't see anything.
>

The solution I posted earlier today runs OK without any modifications in
XSLT 2.0 (Saxon 7.5):

http://aspn.activestate.com/ASPN/Mail/Message/XSL-List/1670297

>
>
>
> <?xml version="1.0"?>
> <xsl:stylesheet version="1.0"
>   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>   xmlns:exsl="http://exslt.org/math">
>
> <xsl:variable name="mean" select="sum(/root/node) div count(/root/node)"/>
>
> <xsl:variable name="diffs">
>   <root>
>     <xsl:for-each select="/root/node">
>       <node squaredDiff="{exsl:power($mean - .,2)}">


Why is the extension function necessary here? Multiplying the number by
itself in pure XSLT is unlikely to be any slower.
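
As a minimal sketch of that suggestion, the squared difference can be
computed with the ordinary XPath 1.0 multiplication operator, so this step
needs no extension namespace. The sketch assumes the same /root/node input
as the quoted stylesheet and only emits the intermediate tree; it is an
illustration, not a drop-in replacement for the whole stylesheet.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Arithmetic mean of all /root/node values -->
  <xsl:variable name="mean" select="sum(/root/node) div count(/root/node)"/>

  <xsl:template match="/">
    <root>
      <xsl:for-each select="/root/node">
        <!-- Plain multiplication instead of exsl:power($mean - ., 2) -->
        <node squaredDiff="{($mean - .) * ($mean - .)}">
          <xsl:copy-of select="."/>
        </node>
      </xsl:for-each>
    </root>
  </xsl:template>

</xsl:stylesheet>
```

Because it uses only XSLT 1.0 and XPath 1.0 constructs, this part should run
on any conforming processor.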


>          <xsl:copy-of select="."/>
>       </node>
>     </xsl:for-each>
>   </root>
> </xsl:variable>
>
> <xsl:variable name="mean.Of.Sum.Of.Diffs">
>   <xsl:for-each select="$diffs">
>     <xsl:value-of select="sum(/root/node/@squaredDiff) div (count(/root/node)-1)"/>
>   </xsl:for-each>
> </xsl:variable>
>
> <xsl:template match="/">
>   standard deviation: <xsl:value-of select="exsl:sqrt(number($mean.Of.Sum.Of.Diffs))"/>
> </xsl:template>
>
> </xsl:stylesheet>


This solution uses about 2 * N units of memory, which may limit its
applicability, especially when processing long node-sets.
It may also require from three to five traversals of a node-set of length N,
the length of the initial node-set: one to build the temporary tree, plus one
each for sum() and count() wherever the processor cannot answer count()
without a traversal.

An advantage (in efficiency) is that it does not require any recursion.

However, I guess it would be much more efficient if sequences were
used/built instead of node-sets.
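
A rough sketch of what a sequence-based version might look like, assuming an
XSLT 2.0 processor that supports the XPath 2.0 "for" expression (a processor
implementing an earlier working draft may differ in details). The variable
names are illustrative only. XPath 2.0 itself has no square root function, so
the sketch stops at the sample variance; the root would still have to come
from an extension function such as the EXSLT sqrt in the quoted stylesheet,
or from a hand-written template.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- A sequence of numbers: no temporary tree is built -->
    <xsl:variable name="values" select="for $x in /root/node return number($x)"/>
    <xsl:variable name="size" select="count($values)"/>
    <xsl:variable name="mean" select="sum($values) div $size"/>
    <!-- Sum of squared differences, computed over the sequence -->
    <xsl:variable name="sumSq"
      select="sum(for $v in $values return ($v - $mean) * ($v - $mean))"/>
    <!-- Sample variance (N - 1 in the denominator, as in the quoted code) -->
    <xsl:text>variance: </xsl:text>
    <xsl:value-of select="$sumSq div ($size - 1)"/>
  </xsl:template>

</xsl:stylesheet>
```

Whether this is actually faster depends on how the processor represents
sequences internally, so it would need to be measured.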


=====
Cheers,

Dimitre Novatchev.
http://fxsl.sourceforge.net/ -- the home of FXSL




 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

