Subject: Memory utilization high with multi pass
Author: Metin Solmaz
Date: 07 Jul 2009 06:22 AM
Originally Posted: 07 Jul 2009 06:14 AM
Hi,

I have a memory problem with a stylesheet where I use the multi-pass (or pipeline) pattern. With relatively small input files, the memory utilization is not noticeable. However, with bigger input files (e.g., > 6 MB), memory utilization becomes excessive (more than 1 GB) both in xsltproc (libxslt 10118) and in Saxon (6.5).

There are in total 13 passes (yes, I agree it seems a lot, but it was really necessary for the problem at hand). The first phase extends the input XML with some additional information (attributes). The second phase uses the output of the first phase and extends it with even more data and so on. The output of each phase is held in a variable.

A sketch of the multi-pass process:
<!-- start -->
<xsl:variable name="phase1">
  <xsl:apply-templates mode="phase1" select="/"/>
</xsl:variable>

<xsl:variable name="phase2">
  <xsl:apply-templates select="exsl:node-set($phase1)"
                       mode="phase2"/>
</xsl:variable>

<!-- and so on ... -->

<xsl:variable name="phaseN">
  <xsl:apply-templates select="exsl:node-set($phaseN-1)"
                       mode="phaseN"/>
</xsl:variable>
<!-- end -->

Looking closer at the process, each variable (phase) is referenced *only once*, namely in the following phase. Thus, in theory, the variable could be cleaned up (removed from memory) once it is no longer needed. However, since this is presumably not trivial to determine (it *is* in this particular case, but probably not in general), the XSLT processor (at least xsltproc and Saxon) holds all variables in memory until they go out of scope (at the end of the template or for-each or...).

I tried an alternative, via exsl:document, with which each phase is stored in a file and loaded by the next phase via document(). But this does not solve the issue.
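
For reference, the exsl:document variant described above might look roughly like the following two-phase sketch. It assumes libxslt/xsltproc; the file name phase1.xml and the plain-copy phase templates are placeholders, not the real stylesheet. It also assumes that the href of exsl:document and the argument of document() resolve to the same file (for example, the stylesheet and the output live in the same directory) and that the processor evaluates the instructions in document order, which XSLT itself does not guarantee.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    extension-element-prefixes="exsl">

  <xsl:template match="/">
    <!-- Phase 1: write the intermediate result to a file (hypothetical name) -->
    <exsl:document href="phase1.xml" method="xml">
      <xsl:apply-templates select="/" mode="phase1"/>
    </exsl:document>

    <!-- Phase 2: read the file back instead of holding phase 1 in a variable -->
    <xsl:apply-templates select="document('phase1.xml')" mode="phase2"/>
  </xsl:template>

  <!-- Placeholder phases: plain copies; the real phases would add attributes -->
  <xsl:template match="@*|node()" mode="phase1">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" mode="phase1"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="@*|node()" mode="phase2">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" mode="phase2"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

One plausible reason this does not lower the memory footprint: XSLT 1.0 requires repeated document() calls with the same URI to return the same node, so processors typically keep every loaded document in memory for the rest of the transformation.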

I guess that, currently, the only real solution is to run the pipeline outside the stylesheet, as separate transformations (or to use saxon:next-in-chain, but in our end solution we must use xsltproc (libxslt), as that is what we have integrated with). However, I think that, especially for the multi-pass pattern, XSLT processors could be more sophisticated and efficient with respect to memory utilization by freeing variables that will not be referenced anymore.

Regards,
Metin

Subject: Memory utilization high with multi pass
Author: Metin Solmaz
Date: 07 Jul 2009 12:10 PM
Hi again,

After thinking about this further, I believe I found a solution within the stylesheet. Instead of storing the intermediate results in variables and passing them on to the next step, I pass the output of step N to step N+1 as a parameter, and so on (nested). This did indeed reduce the memory utilization to normal levels.

So we get:
<!-- start -->

<xsl:call-template name="phaseN">
  <xsl:with-param name="phaseN-1">
    <xsl:call-template name="phaseN-1">
      <xsl:with-param name="phaseN-2">
        ....
        <xsl:call-template name="phase2">
          <xsl:with-param name="phase1">
            <xsl:apply-templates select="/" mode="phase1"/>
          </xsl:with-param>
        </xsl:call-template>
        ....
      </xsl:with-param>
    </xsl:call-template>
  </xsl:with-param>
</xsl:call-template>


<xsl:template name="phase2">
  <xsl:param name="phase1"/>
  <xsl:apply-templates select="exsl:node-set($phase1)" mode="phase2"/>
</xsl:template>

<xsl:template name="phaseN">
  <xsl:param name="phaseN-1"/>
  <xsl:apply-templates select="exsl:node-set($phaseN-1)" mode="phaseN"/>
</xsl:template>

<!-- end -->

Regards,
Stoic
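
To make the nested-parameter pattern above concrete, here is a minimal three-phase sketch as a complete XSLT 1.0 stylesheet. The attribute names p1, p2, and p3 and the per-phase templates are hypothetical stand-ins for the real phases, which are not shown in the thread; the only extension assumed is the EXSLT exsl:node-set() function already used in the posts.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">

  <!-- Entry point: each intermediate result exists only as the parameter
       of the named template that consumes it, never as a variable. -->
  <xsl:template match="/">
    <xsl:call-template name="phase3">
      <xsl:with-param name="phase2">
        <xsl:call-template name="phase2">
          <xsl:with-param name="phase1">
            <xsl:apply-templates select="/" mode="phase1"/>
          </xsl:with-param>
        </xsl:call-template>
      </xsl:with-param>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="phase2">
    <xsl:param name="phase1"/>
    <xsl:apply-templates select="exsl:node-set($phase1)" mode="phase2"/>
  </xsl:template>

  <xsl:template name="phase3">
    <xsl:param name="phase2"/>
    <xsl:apply-templates select="exsl:node-set($phase2)" mode="phase3"/>
  </xsl:template>

  <!-- Hypothetical phases: copy each element and add one marker attribute -->
  <xsl:template match="*" mode="phase1">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:attribute name="p1">done</xsl:attribute>
      <xsl:apply-templates mode="phase1"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="*" mode="phase2">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:attribute name="p2">done</xsl:attribute>
      <xsl:apply-templates mode="phase2"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="*" mode="phase3">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:attribute name="p3">done</xsl:attribute>
      <xsl:apply-templates mode="phase3"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Whether a processor actually releases each intermediate tree as soon as the consuming template returns is implementation-dependent; the post above reports only that this arrangement brought memory use back to normal levels in the author's setup.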

 