[XSL-LIST Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message]

Re: A super-efficient way to compute the sum of A[i] *

Subject: Re: A super-efficient way to compute the sum of A[i] * B[i] for i=1 to n?
From: "Dimitre Novatchev dnovatchev@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 13 May 2020 03:27:45 -0000
Re:  A super-efficient way to compute the sum of A[i] *
I played a little bit using *saxon:threads*:
https://www.saxonica.com/html/documentation/extensions/attributes/threads.htm
l
,
with the hope that this could lead to faster execution. The results are
below, and they can be useful as a reference.
With a source XML document of 1M elements, each of which has a single text
node with a numeric value, on my PC (6 cores, Intel i7-8700 CPU @3.20GHz,
32.0GB RAM, x64-based processor and 64-bit Operating System (Windows 10)) I
got these results:

*Saxon Ver./Edition*

*saxon:threads*

*Time in seconds*

*Remarks*

Saxon-EE 9.8.0.12

 1

1.8



Saxon-EE 9.8.0.12

 2

15.7

???????????????

----  b -------b ----------

 4

5.7



----  b -------b ----------

 6

4.3



----  b -------b ----------

 8

3.7



----  b -------b ----------

12

3.0



----  b -------b ----------

16

2.9



----  b -------b ----------

20

2.7



----  b -------b ----------

24

2.7



----  b -------b ----------

28

2.6



----  b -------b ----------

32

2.6



Saxon-PE 9.8.0.12

N/A

1.0

???????????????



Notice that the best result was reached when running Saxon-PE, in which the
*saxon:threads* attribute is ignored and no threading takes place.


Also note that the best result with threading is when the number of threads
is just 1.


Here is a fragment of the source XML document, just to give you an idea of
the input:

<vectors>
<seq1>
 <n>0.1</n>
 <n>0.2</n>
 <n>0.3</n>
 <n>0.4</n>
 <n>0.5</n>
 <n>0.6</n>
 <n>0.7</n>
 <n>0.8</n>
 <n>0.9</n>
 <n>1</n>
 <n>0.1</n>
 <n>0.2</n>
 <n>0.3</n>
 <n>0.4</n>
 <n>0.5</n>
 <n>0.6</n>
 <n>0.7</n>
 <n>0.8</n>
 <n>0.9</n>
 <n>1</n>
 <n>0.1</n>

.  .  .  .  .  .  .  .

The <seq1> element has 1 000 000 (1M) child elements named "n".


Then it has an immediate sibling element <seq2> with contents that was
copied from <seq1>.


Here is the transformation that uses *saxon:threads*:

<xsl:stylesheet version="3.0" xmlns:xsl="
http://www.w3.org/1999/XSL/Transform"
 xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:saxon="
http://saxon.sf.net/">
 <xsl:output method="text"/>
 <xsl:variable name="v1" select="/*/seq1/*/text()/number()"/>
 <xsl:variable name="v2" select="/*/seq2/*/text()/number()"/>
 <xsl:variable name="vLength" select="1000000"/>

  <xsl:template match="/">
    <xsl:variable name="vProds" as="xs:double*">
   <xsl:for-each select="1 to $vLength" saxon:threads="6">
     <xsl:sequence select="$v1[current()] * $v2[current()]"/>
   </xsl:for-each>
    </xsl:variable>

    <xsl:value-of select="sum($vProds)"/>
  </xsl:template>
</xsl:stylesheet>



Cheers,

Dimitre

On Sun, May 10, 2020 at 11:53 AM Dimitre Novatchev <dnovatchev@xxxxxxxxx>
wrote:

> For reference on fast sequential algorithms for computing the dot product,
> see this:
>
>
>
https://lemire.me/blog/2018/07/05/how-quickly-can-you-compute-the-dot-product
-between-two-large-vectors/
>
>
> On Sat, May 9, 2020 at 10:52 AM Dimitre Novatchev <dnovatchev@xxxxxxxxx>
> wrote:
>
>> Hi Roger,
>>
>>   > I need a super-efficient way to compute the sum of A[i] * B[i] for
>> i=1 to n.
>>
>> In case you have n cores / processors, and the compiler knows to optimize
>> functions like map(), zipWith() (for-each-pair() in XPath 3.0), then all
>> processors can each do a corresponding multiplication in parallel in a
>> single moment (computing cycle).
>>
>> Then what remains is the summing of the results of these multiplications.
>> This essentially is the map-reduce technique.
>>
>> Interestingly enough, while not all reduce operations can be
>> parallelized, in the case of sum() one can add together n numbers in
>> Log2(n) summing steps -- very similar to the DVC (DiVide and Conquer)
>> technique, but doing it parallel -- not sequentially. So, first there
would
>> be n/2 additions, then n/4 additions (of the results of the first step),
>> then n/8 additions, and so on -- in a total of ceiling(Log2(n)) steps. If
>> for each steps we have sufficient number of processors (n/2 for the first
>> step, less for the following steps), then the summing could be performed
in
>> Log2(n) computing cycles.
>>
>> So, it seems that the whole computation could be performed in something
>> like ~ ceiling(Log2(n)) + 1 computing cycles.
>>
>> Or maybe I am being wrong here? :)  Please, correct me.
>>
>> Cheers,
>> Dimitre
>>
>>
>> On Sat, May 9, 2020 at 4:59 AM Costello, Roger L. costello@xxxxxxxxx <
>> xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>
>>> Hi Folks,
>>>
>>> I need a super-efficient way to compute the sum of A[i] * B[i] for i=1
>>> to n.
>>>
>>> For example, suppose A is this:
>>>
>>> <row>
>>>     <col>0.9</col>
>>>     <col>0.3</col>
>>> </row>
>>>
>>> and B is this:
>>>
>>> <row>
>>>     <col>0.2</col>
>>>     <col>0.8</col>
>>> </row>
>>>
>>> I want to compute:
>>>
>>> (0.9 * 0.2) + (0.3 * 0.8)
>>>
>>> Here's one way to do it:
>>>
>>> sum(for $i in 1 to count($A/col) return number($A/col[$i]) *
>>> number($B/col[$i]))
>>>
>>> I suspect that is not the most efficient approach.
>>>
>>> What is the most efficient approach? I will be doing hundreds of
>>> thousands of these computations, so I want to use the most efficient
>>> approach.
>>>
>>> /Roger

Current Thread

PURCHASE STYLUS STUDIO ONLINE TODAY!

Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced!

Buy Stylus Studio Now

Download The World's Best XML IDE!

Accelerate XML development with our award-winning XML IDE - Download a free trial today!

Don't miss another message! Subscribe to this list today.
Email
First Name
Last Name
Company
Subscribe in XML format
RSS 2.0
Atom 0.3
Site Map | Privacy Policy | Terms of Use | Trademarks
Free Stylus Studio XML Training:
W3C Member
Stylus Studio® and DataDirect XQuery ™are products from DataDirect Technologies, is a registered trademark of Progress Software Corporation, in the U.S. and other countries. © 2004-2013 All Rights Reserved.