|
[XSL-LIST Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message] Re: A super-efficient way to compute the sum of A[i] *
I played a little bit using *saxon:threads*:
https://www.saxonica.com/html/documentation/extensions/attributes/threads.htm
l
,
with the hope that this could lead to faster execution. The results are
below, and they can be useful as a reference.
With a source XML document of 1M elements, each of which has a single text
node with a numeric value, on my PC (6 cores, Intel i7-8700 CPU @3.20GHz,
32.0GB RAM, x64-based processor and 64-bit Operating System (Windows 10)) I
got these results:
*Saxon Ver./Edition*
*saxon:threads*
*Time in seconds*
*Remarks*
Saxon-EE 9.8.0.12
1
1.8
Saxon-EE 9.8.0.12
2
15.7
???????????????
---- b -------b ----------
4
5.7
---- b -------b ----------
6
4.3
---- b -------b ----------
8
3.7
---- b -------b ----------
12
3.0
---- b -------b ----------
16
2.9
---- b -------b ----------
20
2.7
---- b -------b ----------
24
2.7
---- b -------b ----------
28
2.6
---- b -------b ----------
32
2.6
Saxon-PE 9.8.0.12
N/A
1.0
???????????????
Notice that the best result was reached when running Saxon-PE, in which the
*saxon:threads* attribute is ignored and no threading takes place.
Also note that the best result with threading is when the number of threads
is just 1.
Here is a fragment of the source XML document, just to give you an idea of
the input:
<vectors>
<seq1>
<n>0.1</n>
<n>0.2</n>
<n>0.3</n>
<n>0.4</n>
<n>0.5</n>
<n>0.6</n>
<n>0.7</n>
<n>0.8</n>
<n>0.9</n>
<n>1</n>
<n>0.1</n>
<n>0.2</n>
<n>0.3</n>
<n>0.4</n>
<n>0.5</n>
<n>0.6</n>
<n>0.7</n>
<n>0.8</n>
<n>0.9</n>
<n>1</n>
<n>0.1</n>
. . . . . . . .
The <seq1> element has 1 000 000 (1M) child elements named "n".
Then it has an immediate sibling element <seq2> with contents that was
copied from <seq1>.
Here is the transformation that uses *saxon:threads*:
<xsl:stylesheet version="3.0" xmlns:xsl="
http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:saxon="
http://saxon.sf.net/">
<xsl:output method="text"/>
<xsl:variable name="v1" select="/*/seq1/*/text()/number()"/>
<xsl:variable name="v2" select="/*/seq2/*/text()/number()"/>
<xsl:variable name="vLength" select="1000000"/>
<xsl:template match="/">
<xsl:variable name="vProds" as="xs:double*">
<xsl:for-each select="1 to $vLength" saxon:threads="6">
<xsl:sequence select="$v1[current()] * $v2[current()]"/>
</xsl:for-each>
</xsl:variable>
<xsl:value-of select="sum($vProds)"/>
</xsl:template>
</xsl:stylesheet>
Cheers,
Dimitre
On Sun, May 10, 2020 at 11:53 AM Dimitre Novatchev <dnovatchev@xxxxxxxxx>
wrote:
> For reference on fast sequential algorithms for computing the dot product,
> see this:
>
>
>
https://lemire.me/blog/2018/07/05/how-quickly-can-you-compute-the-dot-product
-between-two-large-vectors/
>
>
> On Sat, May 9, 2020 at 10:52 AM Dimitre Novatchev <dnovatchev@xxxxxxxxx>
> wrote:
>
>> Hi Roger,
>>
>> > I need a super-efficient way to compute the sum of A[i] * B[i] for
>> i=1 to n.
>>
>> In case you have n cores / processors, and the compiler knows to optimize
>> functions like map(), zipWith() (for-each-pair() in XPath 3.0), then all
>> processors can each do a corresponding multiplication in parallel in a
>> single moment (computing cycle).
>>
>> Then what remains is the summing of the results of these multiplications.
>> This essentially is the map-reduce technique.
>>
>> Interestingly enough, while not all reduce operations can be
>> parallelized, in the case of sum() one can add together n numbers in
>> Log2(n) summing steps -- very similar to the DVC (DiVide and Conquer)
>> technique, but doing it parallel -- not sequentially. So, first there
would
>> be n/2 additions, then n/4 additions (of the results of the first step),
>> then n/8 additions, and so on -- in a total of ceiling(Log2(n)) steps. If
>> for each steps we have sufficient number of processors (n/2 for the first
>> step, less for the following steps), then the summing could be performed
in
>> Log2(n) computing cycles.
>>
>> So, it seems that the whole computation could be performed in something
>> like ~ ceiling(Log2(n)) + 1 computing cycles.
>>
>> Or maybe I am being wrong here? :) Please, correct me.
>>
>> Cheers,
>> Dimitre
>>
>>
>> On Sat, May 9, 2020 at 4:59 AM Costello, Roger L. costello@xxxxxxxxx <
>> xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>
>>> Hi Folks,
>>>
>>> I need a super-efficient way to compute the sum of A[i] * B[i] for i=1
>>> to n.
>>>
>>> For example, suppose A is this:
>>>
>>> <row>
>>> <col>0.9</col>
>>> <col>0.3</col>
>>> </row>
>>>
>>> and B is this:
>>>
>>> <row>
>>> <col>0.2</col>
>>> <col>0.8</col>
>>> </row>
>>>
>>> I want to compute:
>>>
>>> (0.9 * 0.2) + (0.3 * 0.8)
>>>
>>> Here's one way to do it:
>>>
>>> sum(for $i in 1 to count($A/col) return number($A/col[$i]) *
>>> number($B/col[$i]))
>>>
>>> I suspect that is not the most efficient approach.
>>>
>>> What is the most efficient approach? I will be doing hundreds of
>>> thousands of these computations, so I want to use the most efficient
>>> approach.
>>>
>>> /Roger
|
PURCHASE STYLUS STUDIO ONLINE TODAY!Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced! Download The World's Best XML IDE!Accelerate XML development with our award-winning XML IDE - Download a free trial today! Subscribe in XML format
|

Cart


![Re: A super-efficient way to compute the sum of A[i] *](/images/get_stylus.gif)





