Re: A super-efficient way to compute the sum of A[i] *

Subject: Re: A super-efficient way to compute the sum of A[i] * B[i] for i=1 to n?
From: "Liam R. E. Quin liam@xxxxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sat, 9 May 2020 17:40:08 -0000

On Sat, 2020-05-09 at 12:00 +0000, Costello, Roger L.
costello@xxxxxxxxx wrote:
> Hi Folks,
> 
> I need a super-efficient way to compute the sum of A[i] * B[i] for
> i=1 to n.

The most efficient way to do something in computing is often to avoid
doing it. So if it's a performance bottleneck, see if you can avoid it
altogether.

I did some timings (see below) but found overall that building the XML
tree dominated: a stylesheet that just produced a message took 15
elapsed seconds on a 700MByte input file, including JVM startup and
compiling the stylesheet; the computation took another 4 or 5 seconds.

Timings:

For what it's worth, I tried with slightly different XML -- a v element
having a and b children,
  <v><a>.438743</a><b>.4874343</b></v>
  <v>...

Saxon 9 EE took 19 seconds to process 10,000,000 v elements, of which
the first 1.5 seconds or so was JVM startup and stylesheet compilation
(based on a test file with only 3 v elements).

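    <xsl:stylesheet version="3.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns:my="http://example.org/my"
        exclude-result-prefixes="xs my">

      <!-- the my: namespace URI is a placeholder; any URI will do -->
      <xsl:template match="/">
        <r>
          <xsl:message select="'count: ' || count(//v)"/>
          <xsl:value-of select="sum(//v ! my:pair(./a, ./b))"/>
        </r>
      </xsl:template>

      <!-- typed parameters: untyped a and b are converted to xs:double -->
      <xsl:function name="my:pair" as="xs:double">
        <xsl:param name="a" as="xs:double"/>
        <xsl:param name="b" as="xs:double"/>
        <xsl:sequence select="$a * $b"/>
      </xsl:function>

    </xsl:stylesheet>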

If the element names were longer, it'd take more time (my test file was
half a gigabyte).

Eliminating the function and using
   <xsl:value-of select="sum( //v ! (./a * ./b) )" />
was about the same time or a second slower, BUT the first time I had a
typo, a , instead of a *, and this was not an error: with a comma the
expression simply sums every a and b value. The function version is
more robust against errors.

Using (./a treat as xs:double) * (./b treat as xs:double) produced a
runtime error after 14.5 seconds (using an XML schema would have
removed the error), showing that constructing the sequence of v elements
is taking most of the time.

Using cast instead of treat (xs:facepalm) worked and took 19 to 21
seconds (running it multiple times).
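
The difference between the two operators, as a hedged sketch: "treat as" only asserts that a value already has the given type and does no conversion, while "cast as" atomizes its operand and converts the value. The stylesheet below assumes the same v/a/b input and is only an illustration of the cast form, not necessarily the exact expression timed above.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:template match="/">
    <r>
      <!-- cast as atomizes each element and converts its untyped
           value to xs:double before the multiplication -->
      <xsl:value-of select="sum(//v !
          ((./a cast as xs:double) * (./b cast as xs:double)))"/>
    </r>
  </xsl:template>

</xsl:stylesheet>
```

Note that "cast as xs:double" expects exactly one value, so a v element with no a or no b child would raise an error here; "cast as xs:double?" would allow an empty operand.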

Changing //v to /s/v made a small improvement (18 or 19 seconds instead
of 20).

Using a template to match v elements and return a * b, storing that in
a variable of type xs:double*, and returning sum() on it, was about
the same speed.
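
The template-based variant might look roughly like this. It is a sketch under assumptions: the root element is taken to be s (as in the /s/v path above), and the variable name products is invented for illustration.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:template match="/">
    <!-- the as attribute keeps the results as a sequence of doubles
         instead of wrapping them in a temporary tree -->
    <xsl:variable name="products" as="xs:double*">
      <xsl:apply-templates select="/s/v"/>
    </xsl:variable>
    <r>
      <xsl:value-of select="sum($products)"/>
    </r>
  </xsl:template>

  <!-- each v element contributes one product -->
  <xsl:template match="v" as="xs:double">
    <xsl:sequence select="xs:double(a) * xs:double(b)"/>
  </xsl:template>

</xsl:stylesheet>
```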

Using <double-value> instead of <v> to have a 763 MByte file took only
slightly longer.

I did notice, however, that the process was getting about twice as many
CPU seconds as elapsed seconds, showing evidence of multi-threading.

For what it's worth I tried constructing two sequences, instead of
reading elements from an XML file; it took about 5 or 6 seconds, or,
with the same input file as before, about 18 seconds - in other words
it's not the XML part that's taking time.
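
One way to run the two-sequences experiment without any input document is sketched below. The way the sequences are generated, the parameter n, and the use of for-each-pair() are all assumptions for illustration and not the code that was actually timed; for-each-pair() also needs a processor that supports higher-order functions.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- number of pairs; hypothetical parameter -->
  <xsl:param name="n" as="xs:integer" select="10000000"/>

  <!-- default entry point when no source document is supplied -->
  <xsl:template name="xsl:initial-template">
    <!-- two generated sequences of doubles standing in for A and B -->
    <xsl:variable name="A" as="xs:double*"
        select="(1 to $n) ! (xs:double(.) div $n)"/>
    <xsl:variable name="B" as="xs:double*"
        select="(1 to $n) ! (1 - xs:double(.) div $n)"/>
    <r>
      <!-- pair up A[i] and B[i], multiply, and sum the products -->
      <xsl:value-of select="sum(for-each-pair($A, $B,
          function($a, $b) { $a * $b }))"/>
    </r>
  </xsl:template>

</xsl:stylesheet>
```

With Saxon this would be started with the -it option rather than a source file.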

Liam

-- 
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations:  http://www.fromoldbooks.org
