[XSL-LIST Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message]

Re: A super-efficient way to compute the sum of A[i] *

Subject: Re: A super-efficient way to compute the sum of A[i] * B[i] for i=1 to n?
From: "Liam R. E. Quin liam@xxxxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sat, 9 May 2020 17:40:08 -0000
Re:  A super-efficient way to compute the sum of A[i] *
On Sat, 2020-05-09 at 12:00 +0000, Costello, Roger L.
costello@xxxxxxxxx wrote:
> Hi Folks,
> 
> I need a super-efficient way to compute the sum of A[i] * B[i] for
> i=1 to n.

The most efficient way to do something in computing is often to avoid
doing it. So if it's a performance bottleneck, see if you can avoid it
altogether.

I did some timings (see below) but found overall that building the XML
tree dominated: a stylesheet that just produced a message took 15
elapsed seconds on a 700MByte input file, including JVM startup and
compiling the stylesheet; the computation took another 4 or 5 seconds.


Timings:

For what it's worth, i tried with slightly different XML -- a v element
having a and b children,
  <v><a>.438743</a><b>.4874343</b></v>
  <v>...

Saxon 9 EE took 19 seconds to process 10,000,000 v elements, of which
the first 1.5 seconds or so was JVM startup and stylesheet complilaion
(based on a test file with only 3 v elements).

    <xsl:template match="/">
	<r>
	  <xsl:message select="'count: ' || count(//v)"/>
	  <xsl:value-of select="sum( //v ! my:pair(./a, ./b))" />
	</r>
    </xsl:template>

    <xsl:function name="my:pair">
	<xsl:param name="a" as="xs:double" />
	<xsl:param name="b" as="xs:double" />
	<xsl:sequence select="$a * $b"/>
    </xsl:function>

If the element names were longer, it'd take more time (my test file was
half a gigabyte).

Eliminating the function and using
   <xsl:value-of select="sum( //v ! (./a * ./b) )" />
was about the same time or a second slower, BUT the first time i had a
typo, a , instead of a *, and this was not an error. The function
version is more robust against errors.

Using (./a treat as xs:double) * ./b treat as xs:double) produced a
runtime error after 14.5 seconds (using an XML schema would have
removed the error) showing that constructing the sequence of v elemnts
is taking most of the time.

Using cast instead of treat (xs:facepalm) worked and took 19 to 21
seconds (running it multiple times).

Changing //v to /s/v made a small improvement (18 or 19 seconds instead
of 20).

Using a template to match v elements and return a * b, storing tha tin
a variable of tytpe xs:double*, and returning sum() on it, was about
the same speed.

Using <double-value> instead of <v> to have a 763 MByte file took only
slighty longer.

I did notice, however, that the process was getting about twice as many
CPU seconds as elapsed seconds, showing evidence of multi-threading.

For what it's worth i tried constructing two sequences, instead of
reading elements from an XML file; it took about 5 or 6 seconds, or,
with the same input file as before, about 18 seconds - in other words
it's not the XML part that's taking time.


Liam

-- 
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations:  http://www.fromoldbooks.org

Current Thread

PURCHASE STYLUS STUDIO ONLINE TODAY!

Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced!

Buy Stylus Studio Now

Download The World's Best XML IDE!

Accelerate XML development with our award-winning XML IDE - Download a free trial today!

Don't miss another message! Subscribe to this list today.
Email
First Name
Last Name
Company
Subscribe in XML format
RSS 2.0
Atom 0.3
Site Map | Privacy Policy | Terms of Use | Trademarks
Free Stylus Studio XML Training:
W3C Member
Stylus Studio® and DataDirect XQuery ™are products from DataDirect Technologies, is a registered trademark of Progress Software Corporation, in the U.S. and other countries. © 2004-2013 All Rights Reserved.