Subject: Re: Grouping elements that have at least one common value
From: "Matthieu Ricaud-Dussarget ricaudm@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 26 Jun 2023 16:16:49 -0000
Hi Michael,

Thanks for your feedback.
Yes, true: the xsl:break in the iteration over 1 to 100 000 000 keeps it
cheap, with just enough iterations and no more.
I added <xsl:assert test=". lt 1000"/> and it is never triggered. I also
added an xsl:message to see the value, which is not high at all (1, 2, ...
always less than 10).

About the except: I now use a sequence of generated ids to exclude the
already processed nodes:

<xsl:template name="els:process">
    <xsl:param name="GRCHOIX" as="element()*"/>
    <xsl:param name="processed-GRCHOIX.ids" select="()" as="xs:string*"/>
    <xsl:if test="not(empty($GRCHOIX))">
      <xsl:variable name="start-node" select="$GRCHOIX[1]" as="element(GRCHOIX)?"/>
      <xsl:variable name="start-node.transitive-closure" as="element(GRCHOIX)*"
        select="els:transitive-closure($start-node)"/>
      <!-- ids of every node grouped so far, including the group just built;
           filtering on the old ids only would pick the same start node twice -->
      <xsl:variable name="processed-GRCHOIX.ids.new" as="xs:string*"
        select="($processed-GRCHOIX.ids, $start-node.transitive-closure/generate-id())"/>
      <GROUP>
        <xsl:sequence select="$start-node.transitive-closure"/>
      </GROUP>
      <xsl:call-template name="els:process">
        <!--<xsl:with-param name="GRCHOIX" select="$GRCHOIX except $start-node.transitive-closure"/>-->
        <xsl:with-param name="GRCHOIX"
          select="$GRCHOIX[not(generate-id() = $processed-GRCHOIX.ids.new)]"/>
        <xsl:with-param name="processed-GRCHOIX.ids" select="$processed-GRCHOIX.ids.new"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

It looks like it didn't really change the performance.

There's another "except" within els:transitive-closure:
<xsl:variable name="next-nodes" select="($origin ! els:step(.)) except $result"/>
Maybe this one is costly too?

So I applied the same method, but it didn't go any faster either:

<xsl:function name="els:transitive-closure" as="node()*">
    <xsl:param name="start-node" as="node()"/>
    <xsl:iterate select="1 to 100000000">
      <xsl:param name="result" as="node()*" select="()"/>
      <xsl:param name="origin" as="node()*" select="$start-node"/>
      <xsl:param name="result.ids" as="xs:string*" select="()"/>
      <xsl:variable name="next-nodes"
        select="($origin ! els:step(.))[not(generate-id() = $result.ids)]"/>
      <xsl:assert test=". lt 1000"/>
      <xsl:choose>
        <xsl:when test="empty($next-nodes)">
          <xsl:sequence select="$result"/>
          <xsl:break/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:variable name="result.new" select="$result | $next-nodes"/>
          <xsl:next-iteration>
            <xsl:with-param name="result" select="$result.new"/>
            <xsl:with-param name="origin" select="$next-nodes"/>
            <xsl:with-param name="result.ids" select="$result.new/generate-id()"/>
          </xsl:next-iteration>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:iterate>
  </xsl:function>
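
One variant that might be worth trying, as a sketch only: keep the ids
already seen as keys of a map instead of a growing sequence of strings, so
that each membership test is a map lookup. The $seen parameter and the map
prefix are assumptions added for illustration; xsl, xs and els are assumed
to be bound as in the rest of the stylesheet, and whether this helps at all
depends on how the processor already optimizes the "=" comparison.

<xsl:function name="els:transitive-closure" as="node()*"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map">
    <xsl:param name="start-node" as="node()"/>
    <xsl:iterate select="1 to 100000000">
      <xsl:param name="result" as="node()*" select="()"/>
      <xsl:param name="origin" as="node()*" select="$start-node"/>
      <!-- hypothetical parameter: ids of the nodes in $result, as map keys -->
      <xsl:param name="seen" as="map(xs:string, xs:boolean)" select="map{}"/>
      <xsl:variable name="next-nodes" as="node()*"
        select="($origin ! els:step(.))[not(map:contains($seen, generate-id(.)))]"/>
      <xsl:choose>
        <xsl:when test="empty($next-nodes)">
          <xsl:sequence select="$result"/>
          <xsl:break/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:next-iteration>
            <xsl:with-param name="result" select="$result | $next-nodes"/>
            <xsl:with-param name="origin" select="$next-nodes"/>
            <!-- map:merge keeps the first entry when a key is duplicated -->
            <xsl:with-param name="seen"
              select="map:merge(($seen, $next-nodes ! map:entry(generate-id(.), true())))"/>
          </xsl:next-iteration>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:iterate>
  </xsl:function>

The union "$result | $next-nodes" is kept from the version above, so the
returned sequence is still deduplicated and in document order.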

I haven't tried Saxon profiling or the Oxygen debugger's hotspot view on
the big file yet (I haven't let the transformation run to the end so far).
But running the Oxygen debugger on the small sample from this mail gives 2
hotspots:
- the call to els:transitive-closure($start-node)
- and key('getGrchoixbyChoixCode', $start/CHOIX/@CODE, $root)
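
For context, a minimal sketch of what the key and els:step behind that
second hotspot could look like. Only the key name and the key() call come
from the stylesheet; the match and use patterns, the els namespace URI and
the body of els:step are assumptions made for illustration, and the input
is assumed to be a document:

<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:els="urn:example:els">

    <!-- assumed definition: index each GRCHOIX by the CODE of its CHOIX children -->
    <xsl:key name="getGrchoixbyChoixCode" match="GRCHOIX" use="CHOIX/@CODE"/>

    <!-- assumed body: every GRCHOIX sharing at least one CODE with $start -->
    <xsl:function name="els:step" as="element(GRCHOIX)*">
      <xsl:param name="start" as="element(GRCHOIX)"/>
      <!-- root($start) stands in for the $root variable of the real call -->
      <xsl:sequence
        select="key('getGrchoixbyChoixCode', $start/CHOIX/@CODE, root($start))"/>
    </xsl:function>

</xsl:stylesheet>

With a definition of this shape, els:step also returns $start itself
whenever it has at least one CHOIX/@CODE, since it matches its own codes.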

I also added xsl:message everywhere, and it confirms that the call to
els:transitive-closure is quite costly.
Maybe it is this expression: "$origin ! els:step(.)"?
I guess memory is OK, but I have 37 000 GRCHOIX in my input, and after
about 5 min it looks like only about 1000 have been processed.
It's not linear, though: it should get shorter and shorter.
I'll run the full transformation tonight to see how long it takes.

Cheers
Matthieu RICAUD-DUSSARGET



-- 
Matthieu Ricaud-Dussarget
+33 6.63.25.95.58
