Subject: Re: Does <xsl:copy> use a lot of memory? Is there an alternative that is more efficient?
From: Dimitre Novatchev <dnovatchev@xxxxxxxxx>
Date: Sun, 2 Sep 2012 08:34:34 -0700
In case the input document is parsed successfully, wouldn't the use of
<xsl:sequence> (instead of <xsl:copy-of>) result in using less memory?
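
The difference between the two instructions is easiest to see outside a result-tree constructor. The following is a minimal sketch, not taken from the original stylesheet: the `my:` namespace, the two function names, and the `report`/`check` output elements are all hypothetical and exist only for illustration. `xsl:sequence` returns the nodes selected by the key, while `xsl:copy-of` creates new nodes with a new identity, which the `is` operator makes visible.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:my="urn:example:my"
                exclude-result-prefixes="#all"
                version="2.0">

  <xsl:key name="ids" match="*[@id]" use="@id"/>

  <!-- Hypothetical helper: returns the referenced elements themselves.
       No new nodes are created. The third argument to key() is needed
       because there is no context item inside a function. -->
  <xsl:function name="my:target-by-reference" as="element()*">
    <xsl:param name="ref" as="element()"/>
    <xsl:sequence select="key('ids', $ref/@idref, root($ref))"/>
  </xsl:function>

  <!-- Hypothetical helper: returns fresh deep copies of the referenced elements. -->
  <xsl:function name="my:target-by-copy" as="element()*">
    <xsl:param name="ref" as="element()"/>
    <xsl:copy-of select="key('ids', $ref/@idref, root($ref))"/>
  </xsl:function>

  <!-- For every referencing element, report whether each helper handed back
       the original node (compared by identity with "is"). -->
  <xsl:template match="/">
    <report>
      <xsl:for-each select="//*[@idref]">
        <check name="{name()}"
               same-node-via-sequence="{my:target-by-reference(.)[1] is key('ids', @idref)[1]}"
               same-node-via-copy-of="{my:target-by-copy(.)[1] is key('ids', @idref)[1]}"/>
      </xsl:for-each>
    </report>
  </xsl:template>

</xsl:stylesheet>
```

Run against the small `<Test>` example from the quoted message, the `xsl:sequence` variant should report `true` and the `xsl:copy-of` variant `false`. Inside `<xsl:copy>`, as in the quoted template, the selected nodes are written to the result tree either way, so any saving would come mainly from variables and function results.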

Cheers,
Dimitre

On Sun, Sep 2, 2012 at 7:31 AM, Michael Kay <mike@xxxxxxxxxxxx> wrote:
> Memory is used for the source document and for intermediate variables. In
> Saxon, and I suspect in most processors, no memory is used for the result
> tree provided that the transformation is writing directly to a serializer.
>
> Intrinsically, all xsl:copy has to do is to send two events - startElement
> and endElement - to the serializer.
>
> I would strongly suspect that the out of memory error occurs during building
> of the source tree, and will happen whatever transformation you run. For a
> 370 MB input document, you should probably allocate at least 2 GB of memory,
> preferably more.
>
> Michael Kay
> Saxonica
>
>
> On 02/09/2012 13:47, Costello, Roger L. wrote:
>>
>> Hi Folks,
>>
>> Does <xsl:copy> use a lot of memory?
>>
>> Is there an alternative that is more efficient?
>>
>> Consider this problem. I have an XML document in which some elements have
>> an id attribute and others have an idref attribute. If an element A
>> references element B, then I want to embed B inside A.
>>
>> Example: I want to convert this:
>>
>> <Test>
>>      <A idref="b" />
>>      <B id="b" />
>> </Test>
>>
>> to this:
>>
>> <Test>
>>      <A>
>>          <B id="b" />
>>      </A>
>>      <B id="b" />
>> </Test>
>>
>> Notice that A references B, and after processing B is nested inside A.
>>
>> Here's a template that handles elements with a reference:
>>
>>      <xsl:key name="ids" match="*[@id]" use="@id"/>
>>
>>      <xsl:template match="*[@idref]">
>>          <xsl:variable name="refed-element" select="key('ids', @idref)"/>
>>          <xsl:copy>
>>              <xsl:copy-of select="@* except @idref" />
>>              <xsl:sequence select="$refed-element" />
>>          </xsl:copy>
>>      </xsl:template>
>>
>> The complete program is below.
>>
>> It works fine if:
>>
>> (a) The XML document is small.
>> (b) I don't have to repeat this embedding process too many times.
>>
>> However, such is not the case. I am dealing with an XML document that is
>> 370 MB in size and has tens of thousands of references. And I have to repeat
>> the embedding process multiple times.
>>
>> Saxon gives me an "out of memory error."
>>
>> I suspect this is due to the <xsl:copy> instruction. I believe it is
>> making new copies, thereby consuming lots of memory. True?
>>
>> So, is there an alternative to <xsl:copy> that is more efficient?
>>
>> Is there a way to express the above template rule that is more efficient?
>>
>> /Roger
>>
>> -----------------------------------------------------------------------------------------
>> <?xml version="1.0" encoding="UTF-8"?>
>> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>>                  exclude-result-prefixes="#all"
>>                  version="2.0">
>>
>>      <xsl:output method="xml" />
>>
>>      <xsl:key name="ids" match="*[@id]" use="@id"/>
>>
>>      <xsl:template match="*[@idref]">
>>          <xsl:variable name="refed-element" select="key('ids', @idref)"/>
>>          <xsl:copy>
>>              <xsl:copy-of select="@* except @idref" />
>>              <xsl:sequence select="$refed-element" />
>>          </xsl:copy>
>>      </xsl:template>
>>
>>      <xsl:template match="node()">
>>          <xsl:copy>
>>              <xsl:copy-of select="@*"/>
>>              <xsl:apply-templates />
>>          </xsl:copy>
>>      </xsl:template>
>>
>> </xsl:stylesheet>
>
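
If the embedding has to be repeated because embedded elements themselves carry an idref (an assumption; the quoted message does not say why the process is repeated), one possible variation is to apply templates to the referenced element instead of emitting it with `xsl:sequence`. This is only a sketch under that assumption, and it further assumes the references contain no cycles, since a cycle would recurse without end. It reuses the key and template structure from the quoted stylesheet.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                exclude-result-prefixes="#all"
                version="2.0">

  <xsl:output method="xml"/>

  <xsl:key name="ids" match="*[@id]" use="@id"/>

  <!-- Element with a reference: copy it without @idref, then process the
       referenced element(s) through the same templates, so references
       inside them are embedded in the same pass.
       Assumption: the reference graph has no cycles. -->
  <xsl:template match="*[@idref]">
    <xsl:copy>
      <xsl:copy-of select="@* except @idref"/>
      <xsl:apply-templates select="key('ids', @idref)"/>
    </xsl:copy>
  </xsl:template>

  <!-- Everything else: shallow copy, then recurse into the children. -->
  <xsl:template match="node()">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

With this variation a single run resolves chains such as A referencing B referencing C, which avoids holding intermediate results between runs. It does nothing to reduce the memory needed to build the source tree itself.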



-- 
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
Never fight an inanimate object
-------------------------------------
To avoid situations in which you might make mistakes may be the
biggest mistake of all
------------------------------------
Quality means doing it right when no one is looking.
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play
-------------------------------------
Facts do not cease to exist because they are ignored.
-------------------------------------
I finally figured out the only reason to be alive is to enjoy it.
