RE: Stylesheet Optimization -- How to Make It Faster

Subject: RE: Stylesheet Optimization -- How to Make It Faster
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Tue, 28 Nov 2006 09:14:04 -0000
(a) It would be a nice courtesy if you could lay out the code so that we can
read it.

(b) What XSLT processor are you using?

(c) The most obvious inefficiency is here:
    expand="{$abbreviations[.=$abbr]/following-sibling::expanded}"
    This would benefit from use of keys.
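
A minimal sketch of what a key-based lookup might look like, assuming an XSLT 2.0 processor and the publishers_data.xml layout shown in the quoted message below. The key name publisher-by-abbrev, the variable publishers-doc, and the named template make-abbr are illustrative names only; they are not part of the original stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Index every publisher element by the string value of its abbrev child
       (hypothetical key name). -->
  <xsl:key name="publisher-by-abbrev" match="publisher" use="abbrev"/>

  <!-- The reference document is loaded once and reused for every lookup
       (hypothetical variable name). -->
  <xsl:variable name="publishers-doc" select="document('publishers_data.xml')"/>

  <!-- Builds the abbr element for one abbreviation string. The three-argument
       key() call restricts the lookup to the reference document rather than
       the document containing the context node. -->
  <xsl:template name="make-abbr">
    <xsl:param name="abbr" as="xs:string"/>
    <abbr type="title"
        expand="{key('publisher-by-abbrev', $abbr, $publishers-doc)/expanded}">
      <xsl:value-of select="$abbr"/></abbr>
  </xsl:template>

</xsl:stylesheet>
```

The predicate $abbreviations[.=$abbr] compares $abbr against every abbrev element each time it is evaluated, whereas a processor will typically build the key index once per document and answer each key() call from that index. How much this helps depends on the processor and on how many abbreviations the reference file contains.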

Michael Kay
http://www.saxonica.com/


> -----Original Message-----
> From: Jeff Sese [mailto:jsese@xxxxxxxxxxxx]
> Sent: 28 November 2006 01:41
> To: Xsl-List
> Subject:  Stylesheet Optimization -- How to Make It Faster
>
> I have a stylesheet that adds markup to text nodes that match an
> abbreviation in a reference XML file. It's working nicely, but the
> processing time is very slow; I'm guessing that's because it's
> processing text nodes. An 800 KB file takes about 25 minutes to
> process, and I have around 800 files to process (of varying sizes,
> some relatively small and some fairly large). Is there any way to
> optimize my stylesheet so that it processes the files faster?
>
> Here is my stylesheet:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet version="2.0"
>     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>     xmlns:xs="http://www.w3.org/2001/XMLSchema"
>     xmlns:ati="http://www.asiatype.com/xslt-functions"
>     exclude-result-prefixes="xs ati">
>
>   <xsl:output method="xml" version="1.0" encoding="UTF-8"/>
>
>   <xsl:variable name="abbreviations" as="element()+"
>       select="document('publishers_data.xml')/root/publisher/abbrev"/>
>
>   <xsl:template match="/">
>     <xsl:apply-templates/>
>   </xsl:template>
>
>   <xsl:template match="text()[ancestor::ab and
>       not(ancestor::note[@id and @n and @lang])]">
>     <xsl:variable name="str" as="xs:string" select="."/>
>     <xsl:choose>
>       <xsl:when test="exists($abbreviations[matches($str,
>           concat('(^|\W)(',ati:escape(.),')($|\W)'))])">
>         <xsl:variable name="search-str" as="xs:string+"
>             select="$abbreviations[matches($str,
>                 concat('(^|\W)(',ati:escape(.),')($|\W)'))]"/>
>         <xsl:variable name="replace" as="element()*">
>           <xsl:for-each select="$search-str">
>             <xsl:variable name="abbr" as="xs:string" select="."/>
>             <abbr type="title"
>                 expand="{$abbreviations[.=$abbr]/following-sibling::expanded}">
>               <xsl:value-of select="$abbr"/></abbr>
>           </xsl:for-each>
>         </xsl:variable>
>         <xsl:sequence
>             select="ati:replace-with-nodes($str, $search-str, $replace)"/>
>       </xsl:when>
>       <xsl:otherwise>
>         <xsl:value-of select="$str"/>
>       </xsl:otherwise>
>     </xsl:choose>
>   </xsl:template>
>
>   <xsl:template match="@*|element()|comment()|processing-instruction()"
>       mode="#all">
>     <xsl:copy>
>       <xsl:apply-templates select="@*|node()"/>
>     </xsl:copy>
>   </xsl:template>
>
>   <xsl:function name="ati:replace-with-nodes" as="node()+">
>     <xsl:param name="input" as="xs:string"/>
>     <xsl:param name="words-to-replace" as="xs:string*"/>
>     <xsl:param name="replacement" as="node()*"/>
>     <xsl:variable name="regex"
>         select="string-join(for $w in $words-to-replace
>                 return concat('(', ati:escape($w), ')'),'|')"/>
>     <xsl:analyze-string select="$input" regex="{$regex}">
>       <xsl:matching-substring>
>         <xsl:variable name="i" as="xs:integer"
>             select="(1 to count($words-to-replace))[regex-group(.)]"/>
>         <xsl:sequence select="$replacement[$i]"/>
>       </xsl:matching-substring>
>       <xsl:non-matching-substring>
>         <xsl:value-of select="."/>
>       </xsl:non-matching-substring>
>     </xsl:analyze-string>
>   </xsl:function>
>
>   <xsl:function name="ati:escape">
>     <xsl:param name="s" as="xs:string"/>
>     <xsl:sequence
>         select="replace($s,'[\\\|\.\-\^\?\*\+\(\)\{\}\[\]\$]','\\$0')"/>
>   </xsl:function>
>
> </xsl:stylesheet>
>
> Here's a short version of publishers_data.xml:
>
> <root>
>   <publisher>
>     <abbrev>Inschriften von Priene</abbrev>
>     <expanded>Inschriften von Priene</expanded>
>   </publisher>
>   <publisher>
>     <abbrev>P. Mil. Congr. XVIII</abbrev>
>     <expanded>Papiri documentari dell'Università Cattolica di Milano</expanded>
>   </publisher>
>   <publisher>
>     <abbrev>P. Jud. Des. Misc.</abbrev>
>     <expanded>Discoveries in the Judean Desert XXXVIII</expanded>
>   </publisher>
>   <!-- more publishers here -->
> </root>
>
> Here's a snippet of the source XML:
>
> <!-- preceding::node() of ab -->
> <ab lang="grk" n="1">
> <foreign lang="grk">N N3N-N3N?N=N5 N:N1Oa=0 ON?a=:O
NN1ON5a=7N?O</foreign>
> <note place="margin">a c</note> <lb n="5"/> <foreign
> lang="grk">OOa=9N=N?OO ON?a?& N<N5Oa=0 NN1N<N2a=;ON7N=
N2N1ON9N;N5a=;ON1N=ON?O, a=ON5 N:N1a=6
> NN9N?N=a=;ON9N?O a<&N= a= NN9N;a=5ON9N?O</foreign> <lb/>(III),
<foreign
> lang="grk">a<Oa=6 Oa?O N>B/N5B/ a=N;ON<ON9a=1N4N?O</foreign>
(520/16)<foreign
> lang="grk">N a<1OON?ON9N?N3Oa=1ON?O. a?>NOa=9N4N?ON?O N4a=2 a=
a?>NN;N9-</foreign>
> <note place="margin">v</note> <lb/> <foreign
> lang="grk">N:N1ON=N1ON5a=:O a= ON-N;N7ON1N9 ON?a=;ON?O,
N=N5a==ON5ON?O a=$N=. N:N1a=6 a<&N=
> a<N:N?OOOa=4O N OO	ON1N3a=9ON?O</foreign> <note id="n7" n="7"
lang="ger">
> <foreign lang="grk">a=$N=N N3N-N3N?N=N5 N3a=0O N<N5Oa>=
N1a=Oa=9N=</foreign>
> A</note> <lb/> <foreign lang="grk">a= a?>NN:N1ON1a?N?O. OOa?6ON?O
N4a=2
> a<1OON?Oa=7N1N= ON5N6a?6O a<N>a=5N=N5N3N:N5, OON3N3ON1Oa=4N= N4a=2
N&N5ON5N:a=;N4N7O</foreign>
> <note id="n8b9" n="8b9" lang="ger"> <foreign
> lang="grk">OOa?6ON?ObN=N?N8N5a=;N5ON1N9</foreign> wiederholt s.
<foreign
> lang="grk">a=6OON?Oa?ON1N9</foreign>, s. <foreign
> lang="grk">OON3N3ON1ON5a?O</foreign>.</note>
> <lb/>(I 3). <foreign lang="grk">Oa=0 N3a=0O
a>=NN:N?OON9N;a=1N?O</foreign>
> (<link type="boj" targets="a002" n="BOJTEXT002_T_7">2 T
> 7</link>) <foreign lang="grk">N=N?N8N5a=;N5ON1N9.</foreign> <note
> id="n9" n="9" lang="ger"> <foreign
> lang="grk">a>=NN:N?OON9N;a=1N?O</foreign> Vossius <foreign
> lang="grk">a>=NN3N7ON9N;a=1N?O</foreign> Suid</note> </ab>
> <!-- following::node() of ab -->
>
> All ab nodes appear at the same level (same depth) throughout.
>
> Any suggestions are welcome.
>
> Thanks,
> --
> Jeff
