
Re: Processing Efficiently

Subject: Re: Processing Efficiently
From: Karl Stubsjoen <kstubs@xxxxxxxxx>
Date: Fri, 10 Jun 2005 08:43:26 -0700
What is the cost for loading up a variable with a large XML source?  So:

<xsl:variable name="my_variable"
select="document('my_very_large_source.xml')"/>

Where 'my_very_large_source.xml' is 25MB +

There was a noticeable improvement in processing with the condensed
source, but was that because A) loading the smaller, condensed XML
source was that much easier for the processor, or B) queries against a
condensed XML source are quicker?
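
As a rough sketch of what the variable above involves: document() parses the file and builds an in-memory tree, and the variable then holds a reference to those nodes, not a copy. XSLT 1.0 treats repeated document() calls with the same URI as the same document, so the parse should happen once per transformation. The second variable ($rows) and the xls/R path are assumptions based on the structure quoted further down in this thread.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Global variable: the large file is parsed and held in memory. -->
  <xsl:variable name="my_variable"
      select="document('my_very_large_source.xml')"/>

  <!-- Hypothetical narrower variable. It saves re-walking the path,
       but it is still a plain node-set, not an index. -->
  <xsl:variable name="rows" select="$my_variable/xls/R"/>

  <xsl:template match="/">
    <!-- Report how many lookup rows were loaded. -->
    <xsl:value-of select="count($rows)"/>
    <xsl:text> rows loaded&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Both load time and query time generally shrink with a smaller tree, so A and B are hard to separate without timing the load step on its own, as this stylesheet does.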

> I haven't looked at this in detail, but I think you can almost certainly
> solve your performance problems using keys. Look for constructs like
I will try this.
What about variable definitions that point at a section of your XML
source that you refer to often? Would this improve performance too?
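
A minimal sketch of the key-based lookup suggested in this thread, assuming the xls/R/C and xlsRow/xlsColumn structures quoted below. The key name rowByInvoice and the variable names are made up for illustration, and the stylesheet is simplified to copy every source row whose invoice appears in the master document. A variable that points at a section of the source is still scanned by a predicate in a processor that does not optimize it, while a key lets the processor build an index.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" encoding="utf-8"/>

  <!-- Index each master row R by its invoice number. -->
  <xsl:key name="rowByInvoice" match="R" use="C[@c='I']"/>

  <!-- Master document; path taken from the original post. -->
  <xsl:variable name="summary"
      select="document('summarydata/summaryreduced.xml')"/>

  <xsl:template match="/">
    <result>
      <xsl:apply-templates select="xls/xlsRow"/>
    </result>
  </xsl:template>

  <xsl:template match="xlsRow">
    <xsl:variable name="row" select="."/>
    <xsl:variable name="invoice" select="xlsColumn[@column='Invoice_#']"/>
    <!-- In XSLT 1.0, key() searches the document that contains the
         context node, so switch context to the master document. -->
    <xsl:for-each select="$summary">
      <xsl:variable name="match" select="key('rowByInvoice', $invoice)"/>
      <xsl:if test="$match">
        <xsl:variable name="balance" select="$match/C[@c='B']"/>
        <!-- Switch back to the source row before copying it. -->
        <xsl:for-each select="$row">
          <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:attribute name="current_balance">
              <xsl:value-of select="$balance"/>
            </xsl:attribute>
            <xsl:attribute name="diff_balance">
              <xsl:value-of
                  select="$balance - xlsColumn[@column='Balance']"/>
            </xsl:attribute>
            <xsl:copy-of select="xlsColumn"/>
          </xsl:copy>
        </xsl:for-each>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The index is typically built on the first key() call against the master document, so each later lookup avoids a scan of all R rows.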


On 6/10/05, Michael Kay <mike@xxxxxxxxxxxx> wrote:
> I haven't looked at this in detail, but I think you can almost certainly
> solve your performance problems using keys. Look for constructs like
> //thing[property=value] and replace them with calls on the key() function.
>
> Michael Kay
> http://www.saxonica.com/
>
> > -----Original Message-----
> > From: Karl Stubsjoen [mailto:kstubs@xxxxxxxxx]
> > Sent: 08 June 2005 20:34
> > To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > Subject: Re:  Processing Efficiently
> >
> > I already had to reduce the size of the XML quite a bit by sheer
> > element renaming and elimination of unused elements.  $s used to be
> > 25MB, but by eliminating unused elements (I really needed only 2), by
> > renaming "xlsRow" to "R" and "xlsColumn" to "C", and by renaming the
> > attribute "column" to "c", I was able to reduce the size by 1/3.
> >
> > The thing is this:  $s is my master doc, contains the lookup records.
> > I have many individual docs that will be compared against $s, and these
> > files range in size from 20KB to 5MB (appx.).  I don't mind a
> > different approach (for example reducing $s source).  I'm just curious
> > how others would approach something like this.  How would you arrange
> > such documentation for this sort of processing?
> >
> > The scenario is:
> > Large data file for lookups / validation (10 to 20MB)
> > Individual data files (up to 5MB)
> > As individual data files refresh, identify those items that exist in
> > the master list.  Again, this is a topic of "Performance" and "Best
> > Practice" for performing frequent validations of documents this size.
> >
> >
> >
> > On 6/8/05, tomas.vanek@xxxxxxxxxxxxx
> > <tomas.vanek@xxxxxxxxxxxxx> wrote:
> > > using keys could help to speed up the transformation (here is just
> > > the idea):
> > >
> > > ...
> > >        <!-- match the R rows, keyed by invoice number; in XSLT 1.0
> > >             key() only searches the context node's document -->
> > >        <xsl:key name="summaryInvoice"
> > > match="R" use="C[@c='I']"/>
> > >
> > > ...
> > >        <xsl:template match="xlsRow">
> > >                <xsl:variable name="current_invoice"
> > > select="xlsColumn[@column='Invoice_#']"/>
> > >                <xsl:variable name="current_balance"
> > > select="key('summaryInvoice', $current_invoice)/C[@c='B']"/>
> > >                <xsl:variable name="diff_balance"
> > > select="$current_balance - xlsColumn[@column='Balance']"/>
> > > ...
> > >
> > > tomi
> > >
> > >
> > > -----Original Message-----
> > > From: Karl Stubsjoen [mailto:kstubs@xxxxxxxxx]
> > > Sent: Wednesday, June 08, 2005 10:08 AM
> > > To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > > Subject:  Processing Efficiently
> > >
> > > Hello,
> > > I would like to optimize the following:
> > >
> > > Where $s is a 5MB document and the source document is approx. 2-5MB.
> > > The goal:  copy everything in the source that exists in $s.
> > > Catch:  need to know the value of the balance in $s.
> > >
> > > $s looks like:
> > > <xls>
> > > <R row="2">
> > >  <C c="I">2AA9379</C><!-- match value "invoice" -->
> > >  <C c="B">-127.5</C><!-- this is the balance -->
> > > </R>
> > > ...
> > > </xls>
> > >
> > > <xsl:stylesheet version="1.0"
> > > xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> > > <xsl:output method="xml" indent="yes" encoding="utf-8"/>
> > >
> > > <xsl:variable name="s"
> > > select="document('summarydata/summaryreduced.xml')//xls/R"/>
> > >
> > > <xsl:template match="/">
> > > <result>
> > > <xsl:apply-templates
> > > select="xls/xlsRow[xlsColumn[@column='Invoice_#']=$s/C[@c='I'] |
> > > xlsColumn[@column='Balance'][not(.= $s/C[@c='B'])]]"/>
> > > </result>
> > > </xsl:template>
> > >
> > > <xsl:template match="xlsRow">
> > > <xsl:variable name="current_invoice"
> > > select="xlsColumn[@column='Invoice_#']"/>
> > > <xsl:variable name="current_balance"
> > > select="$s[C[@c='I']=$current_invoice]/C[@c='B']"/>
> > > <xsl:variable name="diff_balance"
> > > select="$current_balance - xlsColumn[@column='Balance']"/>
> > > <xsl:copy>
> > >  <xsl:apply-templates select="@*"/>
> > >  <xsl:attribute name="current_balance"><xsl:value-of
> > > select="$current_balance"/></xsl:attribute>
> > >  <xsl:attribute name="diff_balance"><xsl:value-of
> > > select="$diff_balance"/></xsl:attribute>
> > >  <xsl:apply-templates select="xlsColumn"/>
> > > </xsl:copy>
> > > </xsl:template>
> > >
> > > <xsl:template match="@*">
> > > <xsl:copy>
> > >  <xsl:apply-templates select="@*"/>
> > > </xsl:copy>
> > > </xsl:template>
> > >
> > > <xsl:template match="xlsColumn">
> > > <xsl:copy-of select="."/>
> > > </xsl:template>
> > >
> > > </xsl:stylesheet>
> > >
> > >
> > >
