Re: How To Use Streaming To Group Elements in a Flat

Subject: Re: How To Use Streaming To Group Elements in a Flat List?
From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 2 May 2017 22:25:28 -0000
Running your code on Saxon 9.7, I get

  XTSE3430: Template rule is declared streamable but it does not satisfy the
streamability rules.
  * The xsl:for-each-group/@group-starting-with pattern is not motionless

That's because a positional predicate such as *[(position() mod 1000) = 0]
involves counting preceding siblings. Or to look at it another way: the
pattern can't be evaluated simply by looking at the node in isolation; it has
to examine its position relative to other nodes in the document.

But there's an easy workaround: use group-adjacent="(position() - 1) idiv
1000". With this formulation, position() is counting the items being grouped,
not the number of siblings they have.
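
To make the arithmetic concrete, here is a small non-streamed sketch that is
not part of the original stylesheet. It assumes a group size of 3 rather than
1000 and groups the integers 1 to 7 instead of ROW elements; the element names
"groups" and "group" are made up for illustration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="3.0">

    <xsl:output indent="yes"/>

    <!-- Illustration only: invoke with xsl:initial-template as the
         initial named template; no source document is needed. -->
    <xsl:template name="xsl:initial-template">
        <groups>
            <!-- position() is the item's position in the population
                 (1 to 7), so the grouping keys are 0,0,0,1,1,1,2. -->
            <xsl:for-each-group select="1 to 7"
                group-adjacent="(position() - 1) idiv 3">
                <group key="{current-grouping-key()}"
                    items="{current-group()}"/>
            </xsl:for-each-group>
        </groups>
    </xsl:template>

</xsl:stylesheet>
```

This should produce three groups holding 1 2 3, 4 5 6, and 7. Subtracting 1
before the integer division is what makes the first group full-sized.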

Here's the full stylesheet:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="3.0">

    <xsl:mode streamable="yes"/>

    <xsl:template match="ROWDATA">
        <xsl:variable name="resultURIbase" as="xs:string"
            select="concat('out', '/rowdata-')"
        />
        <xsl:variable name="rootname" as="xs:string" select="name(.)"/>

        <xsl:for-each-group select="ROW"
            group-adjacent="(position() - 1) idiv 1000">
            <xsl:result-document
                href="{concat($resultURIbase, generate-id(), '.xml')}">
                <xsl:element name="{$rootname}">
                    <xsl:copy-of select="current-group()"/>
                </xsl:element>
            </xsl:result-document>
        </xsl:for-each-group>

    </xsl:template>

</xsl:stylesheet>


> On 2 May 2017, at 21:55, Eliot Kimber ekimber@xxxxxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> I have some very large (100s of MBs) XML database dump docs that I want to
> break into smaller docs. This is an easy application of for-each-group or of
> a simple tail recursion approach but I wanted to use this as an opportunity
> to learn more about XSLT 3 streaming.
>
> I've read through the XSLT 3 spec and I think I generally understand the
> options but it's still not clear either how or how best to do this type of
> grouping so that it's streamable. I didn't find any examples of this
> specific use case searching on "xslt streaming with grouping" (other than
> older items that don't actually work).
>
> If my source looks like this:
>
> <ROWDATA>
> <ROW><SRVC_CAT_ID>54</SRVC_CAT_ID><PARENT_ID>3</PARENT_ID><SRVC_CAT_NAME>Exterior Lights</SRVC_CAT_NAME><PARENT_NAME>Accessories and Body, Cab</PARENT_NAME></ROW>
> <ROW><SRVC_CAT_ID>53</SRVC_CAT_ID><PARENT_ID>3</PARENT_ID><SRVC_CAT_NAME>Exterior Body Panels</SRVC_CAT_NAME><PARENT_NAME>Accessories and Body, Cab</PARENT_NAME></ROW>
> <ROW><SRVC_CAT_ID>51</SRVC_CAT_ID><PARENT_ID>3</PARENT_ID><SRVC_CAT_NAME>Entertainment Systems</SRVC_CAT_NAME><PARENT_NAME>Accessories and Body, Cab</PARENT_NAME></ROW>
> <ROW><SRVC_CAT_ID>40</SRVC_CAT_ID><PARENT_ID>3</PARENT_ID><SRVC_CAT_NAME>Door Locks &amp; Anti-Theft Systems</SRVC_CAT_NAME><PARENT_NAME>Accessories and Body, Cab</PARENT_NAME></ROW>
> … lots more rows …
> </ROWDATA>
>
> I'd like to generate result files containing 1000 records each, each
> wrapped in the same root element.
>
> The non-stream for-each-group is simple enough:
>
>    <xsl:template match="ROWDATA">
>        <xsl:variable name="resultURIbase" as="xs:string"
>            select="concat($outdir, '/rowdata-')"
>        />
>        <xsl:variable name="rootname" as="xs:string" select="name(.)"/>
>
>        <xsl:for-each-group select="ROW"
>            group-starting-with="*[(position() mod 1000) = 0]">
>            <xsl:result-document
>                href="{concat($resultURIbase, generate-id(), '.xml')}">
>                <xsl:element name="{$rootname}">
>                    <xsl:copy-of select="current-group()"/>
>                </xsl:element>
>            </xsl:result-document>
>        </xsl:for-each-group>
>
>    </xsl:template>
>
>
> But I'm not seeing how to do this using e.g., xsl:iterate. As is often the
> case with XSLT, I feel like I'm missing the obvious.
>
> Is it in fact possible to do what I want in a streamable way?
>
> Thanks,
>
> Eliot
>
> --
> Eliot Kimber
> http://contrext.com
