
Re: Question on streaming and grouping with nested key

Subject: Re: Question on streaming and grouping with nested keys
From: "Felix Sasaki felix@xxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 14 Jul 2017 13:02:18 -0000
2017-07-14 14:41 GMT+02:00 Martin Honnen martin.honnen@xxxxxx <
xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>:

> On 14.07.2017 14:05, Felix Sasaki felix@xxxxxxxxxxxxxx wrote:
>
>> I tried the example from Martin with
>>
>> <xsl:template match="TRANSACTION-LIST">
>>       <xsl:copy>
>>          <xsl:for-each-group select="copy-of(TRANSACTION)"
>> group-by="ITEM2/SUBITEM2/GROUPING-KEY">
>>             <xsl:copy>
>>                <item1-sum><xsl:value-of
>> select="sum(current-group()/ITEM2/SUBITEM2.1)"/></item1-sum>
>>
>> ...
>>
>> It gives me an out of memory error. The input file is 160MB, but the
>> individual transactions are rather small (around 20+ elements). The error
>> also appears if I remove "<xsl:copy>".
>>
>
> 160 MB doesn't sound like a file you need streaming for at all. Does that
> suggestion above cause memory problems only when using streaming (e.g. when
> you have <xsl:mode streamable="yes"/>) or also without streaming?



Without streaming it works.



> Have you tried increasing the memory for Saxon/Java?
>


No.


>
> As you mention Saxon EE, let's hope Michael Kay comes across this thread
> and can certainly tell you more on how to tackle that problem with his
> product.
>
>> I have a working solution using an accumulator and maps, see below, but
>> here I did not manage to use streaming. If I set the accumulator to
>>  streamable="yes", Saxon EE tells me
>>
>>
>> "The xsl:accumulator-rule/@select expression for a streaming accumulator
>> must be motionless"
>>
>>
>> Although I am using copy-of() as in Martin's example.
>>
>>
>>   <xsl:accumulator name="gather-values" as="map(xs:anyAtomicType,
>> node())" initial-value="map{}">
>>      <xsl:accumulator-rule match="TRANSACTION">
>>        <xsl:variable name="current" select="copy-of()"/>
>>
>
> As far as I understand it, you can't use copy-of() in an accumulator you
> want to be streamable. Working with streaming and accumulating values
> requires a change of the usual XSLT coding habits, I think. For instance,
> to capture your key with a streaming accumulator you would need to use
> e.g.
>      <xsl:accumulator-rule match="TRANSACTION/ITEM2/SUBITEM2.2/GROUPING-KEY/text()"
> select="string()"/>
> as only on the text node you are able to read out that value while
> streaming through the document.
>
> So to try to solve that problem with accumulators and streaming I think
> you need several of them, one counting ITEM1, one summing up
> SUBITEM2.1/text(), the above for the key and then you need to combine them
> to store the data together.
>


Thanks. Working without accumulators is fine, just trying to understand the
issue. Other input files are a bit bigger, up to 1.5 GB, so having a
streaming solution would be nice but it's not mandatory.
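
For the archive, here is a rough sketch of how the accumulator idea Martin
describes might be written so that every select expression stays motionless.
It is untested against the real data and rests on assumptions: the element
paths are the ones from Martin's rule (SUBITEM2.1 holding the number,
SUBITEM2.2/GROUPING-KEY holding the key), the accumulator name "sum-by-key"
and the output elements "sums" and "group" are made up for illustration, and
instead of several separate accumulators it keeps all state in one
accumulator whose value is a map.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    exclude-result-prefixes="xs map">

  <!-- Stream the input; nodes without a matching template are skipped. -->
  <xsl:mode streamable="yes" use-accumulators="sum-by-key"
            on-no-match="shallow-skip"/>

  <!-- Hypothetical accumulator. Its value is a map with three entries:
       'key'     = grouping key seen in the current TRANSACTION,
       'pending' = sum of SUBITEM2.1 values in the current TRANSACTION,
       'sums'    = map from grouping key to running total. -->
  <xsl:accumulator name="sum-by-key" as="map(*)" streamable="yes"
      initial-value="map{'key': '', 'pending': 0, 'sums': map{}}">

    <!-- Start of a transaction: reset the per-transaction state. -->
    <xsl:accumulator-rule match="TRANSACTION"
        select="map:put(map:put($value, 'key', ''), 'pending', 0)"/>

    <!-- Values can only be read on the text node while streaming. -->
    <xsl:accumulator-rule match="TRANSACTION/ITEM2/SUBITEM2.1/text()"
        select="map:put($value, 'pending', $value?pending + xs:decimal(.))"/>

    <xsl:accumulator-rule
        match="TRANSACTION/ITEM2/SUBITEM2.2/GROUPING-KEY/text()"
        select="map:put($value, 'key', string())"/>

    <!-- End of a transaction: add the pending amount to the total for its
         key. A transaction without a key is counted under the empty string. -->
    <xsl:accumulator-rule match="TRANSACTION" phase="end"
        select="map:put($value, 'sums',
                  map:put($value?sums, $value?key,
                    (map:get($value?sums, $value?key), 0)[1]
                    + $value?pending))"/>
  </xsl:accumulator>

  <xsl:template match="/">
    <!-- Consume the whole document first so that all rules have fired. -->
    <xsl:apply-templates/>
    <xsl:variable name="sums" select="accumulator-after('sum-by-key')?sums"/>
    <sums>
      <xsl:for-each select="map:keys($sums)">
        <xsl:sort select="."/>
        <group key="{.}" sum="{$sums(.)}"/>
      </xsl:for-each>
    </sums>
  </xsl:template>

</xsl:stylesheet>
```

The point of this design is that memory use grows with the number of
distinct keys rather than with the number of transactions, since no
TRANSACTION element is ever copied. Whether Saxon EE accepts it as
streamable exactly as written is something I would still want to verify.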

- Felix
