
Re: improving performance in creating ids

Subject: Re: improving performance in creating ids
From: "Wendell Piez wapiez@xxxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 24 Apr 2019 21:29:14 -0000
Pieter,

That is excellent.

However, I haven't given up yet on xsl:number/@from -- not saying I'll
explain it or make it work, but unless I miss something (not
impossible), it *should* work the way we want and if it doesn't, there
must be something about it, or the problem, we aren't seeing. (Or a
bug in the processor?)

After all, a use case such as you have described is what this syntax
is clearly meant to address.
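
A minimal, untested sketch of the xsl:number/@from usage under discussion, wrapped in a complete XSLT 3.0 stylesheet so it can be run as is. The match pattern and the choice to write the result into a generated id attribute are assumptions for illustration, not taken from Pieter's stylesheet:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity transform for everything not handled below. -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Assumption: elements lacking @id receive a generated one. -->
  <xsl:template match="*[not(@id)]">
    <xsl:variable name="ilk" select="local-name()"/>
    <xsl:copy>
      <xsl:attribute name="id">
        <xsl:value-of select="$ilk || '-'"/>
        <!-- Count same-named elements up to and including this one,
             restarting at the most recent ancestor or preceding
             element that matches the from pattern. -->
        <xsl:number level="any" from="*[@id]"
                    count="*[local-name() eq $ilk]"/>
      </xsl:attribute>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

One thing worth checking against the real input: as far as the spec reads, with level="any" the from pattern is tested against preceding elements as well as ancestors, so an earlier element with an @id that is not an ancestor would also restart the count.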

The news that a counting-based solution is not much better with a key,
than without it, is interesting, but possibly due to Saxon
optimizations (processor?) ... which suggests that some processors
might *really* take their sweet time with a raw XPath counting-based
solution....
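
For readers following along, the keyed count being compared could be set up roughly as below. The xsl:key declaration was never posted in the thread, so its match and use values are assumptions, as are the template's match pattern and the $scope variable name; the whole thing is untested:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed declaration: index every element by its local name. -->
  <xsl:key name="elems-by-name" match="*" use="local-name()"/>

  <!-- Identity transform for everything not handled below. -->
  <xsl:mode on-no-match="shallow-copy"/>

  <xsl:template match="*[not(@id)][ancestor::*[@id]]">
    <!-- Nearest ancestor carrying an id; passing it as the third
         argument of key() restricts the lookup to its subtree. -->
    <xsl:variable name="scope" select="ancestor::*[@id][1]"/>
    <xsl:copy>
      <!-- Same-named elements in scope that precede this one, plus one. -->
      <xsl:attribute name="id"
          select="$scope/@id || '-' || local-name() ||
                  count(key('elems-by-name', local-name(), $scope)
                        [current() >> .]) + 1"/>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The key lookup narrows the candidates by name and subtree, but the predicate still visits every candidate for every element numbered, which may be part of why the key makes little difference. Note also that the scope element itself is counted if it happens to share the current element's name, since an ancestor comes earlier in document order.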

Cheers, Wendell

On Wed, Apr 24, 2019 at 10:23 AM Pieter Lamers
pieter.lamers@xxxxxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
wrote:
>
> Hi all,
>
> In the end I found the solution for my original numbering plan in this
> xsl:number expression:
>
> <xsl:number level="any" count="*[. &gt;&gt; $ancestor-with-id][@rid]"/>
>
> the '>>' operator performs well enough (total processing time for the
> test book now 5 seconds) and was brought to my kind attention by Erik
> Siegel. Thanks for all your help.
>
> Best,
> Pieter
>
> On 24/04/2019 07:46, Pieter Lamers pieter.lamers@xxxxxxxxxxxx wrote:
> > Hi Wendell,
> >
> > Had not seen your subsequent replies before I signed off last night.
> > Your solution below involves a count which brings back my original
> > performance problem. I think I will change my requirement for
> > "locally" numbered ids somewhat so I can profit most from xsl:number.
> > Still, sad that 'from' cannot serve my purpose (or so it seems).
> >
> > Hi Liam,
> >
> > You are probably right that indexing + keys should work in the xquery
> > solution. I'd have to dive a little further into that area before I
> > can put it to use; my initial efforts did not make a change.
> >
> > Thanks and all the best,
> > Pieter
> >
> > On 23/04/2019 23:47, Wendell Piez wapiez@xxxxxxxxxxxxxxx wrote:
> >> Okay this is my next shot --
> >>
> >> <xsl:value-of select="ancestor::*[exists(@id)][1]/@id || '-' ||
> >> local-name() ||
> >> count(
> >> key('elems-by-name',local-name(),ancestor::*[exists(@id)][1])[current() >> .]
> >> ) + 1"/>
> >> but after having done that I'd probably go back to xsl:number.
> >>
> >> Partly since it's probably as fast, but mainly because declarative
> >> syntax rules.
> >>
> >> (Note: still untested. Use at your own risk!)
> >>
> >> Cheers, Wendell
> >>
> >>
> >> On Tue, Apr 23, 2019 at 5:40 PM Wendell Piez wapiez@xxxxxxxxxxxxxxx
> >> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> >>> Oops, hit button too soon -- you'll see the error there.
> >>>
> >>> I leave scoping the correct count as an exercise, but it's in there
> >>> somewhere! :-)
> >>>
> >>> Cheers, Wendell
> >>>
> >>> On Tue, Apr 23, 2019 at 5:39 PM Wendell Piez
> >>> <wapiez@xxxxxxxxxxxxxxx> wrote:
> >>>> Hi again,
> >>>>
> >>>> Also note if we had a key we would need no variable --
> >>>>
> >>>> <xsl:value-of select="local-name() || '-'"/>
> >>>> <xsl:number level="any" from="*[@id]"
> >>>> count="key('elems-by-name',local-name())"/>
> >>>>
> >>>> which suggests we could also use the third argument of key() ...
> >>>>
> >>>> <xsl:value-of select="local-name() || '-' ||
> >>>> count(key('elems-by-name',local-name(),ancestor::*[exists(@id)][1]))"/>
> >>>>
> >>>>
> >>>> still not tested -- but ought to work, syntax errors aside --
> >>>>
> >>>> Cheers, Wendell
> >>>>
> >>>> On Tue, Apr 23, 2019 at 5:31 PM Wendell Piez
> >>>> <wapiez@xxxxxxxxxxxxxxx> wrote:
> >>>>> Hey Pieter,
> >>>>>
> >>>>> If performance were the issue, I might try factoring out the ID
> >>>>> labeling into a completely separate pass, in order (for example) to
> >>>>> implement it as a sibling traversal, passing parameters forward to
> >>>>> increment the ID values. (If your numbering is fancy, for example
> >>>>> scoping the increment to the element type as well as the ancestor,
> >>>>> you might have to pass a map forward.) I think that ought to be
> >>>>> pretty fast, plus it separates this logic from the other logic of
> >>>>> the XSLT. It's essentially like treating the XSLT engine like an
> >>>>> overpowered SAX parser. (Not that I would know how to make one of
> >>>>> those.)
> >>>>>
> >>>>> But this is only if xsl:number wasn't doing it, after I tried
> >>>>> something like what Martin H shows with plain old templates.
> >>>>>
> >>>>> <xsl:variable name="ilk" select="local-name()"/>
> >>>>> <xsl:value-of select="$ilk || '-'"/>
> >>>>> <xsl:number level="any" from="*[@id]" count="*[local-name() eq $ilk]"/>
> >>>>>
> >>>>> -- untested --
> >>>>>
> >>>>> Cheers, Wendell
> >>>>>
> >>>>> On Tue, Apr 23, 2019 at 10:57 AM Martin Honnen martin.honnen@xxxxxx
> >>>>> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> >>>>>> On 23.04.2019 16:28, Pieter Lamers pieter.lamers@xxxxxxxxxxxx wrote:
> >>>>>>
> >>>>>>> Thanks for your quick reply. The node identity comparison helped
> >>>>>>> quite a bit, although I am still around a minute for a full book
> >>>>>>> of ids. I am not sure how xsl:number would help here, and what
> >>>>>>> kind of performance win it would give over count(). I tried
> >>>>>>> something with a nested transformation, but what should I feed it?
> >>>>>>>
> >>>>>>>       <xsl:number select="*[last()]"/>
> >>>>>>> works (given a set of preceding nodes) but it is slightly slower
> >>>>>>> than a count() in the xquery. Maybe I should be using xsl:number
> >>>>>>> differently?
> >>>>>>
> >>>>>> It is difficult for me to suggest that without knowing the XML input
> >>>>>> structure and whether you want to generate that id based on a
> >>>>>> count or numbering only for certain nodes or some particular
> >>>>>> element type. In general if I wanted to delegate counting to
> >>>>>> xsl:number similar to your function I would define a template in
> >>>>>> a mode for that e.g.
> >>>>>>
> >>>>>>      <xsl:template match="*" mode="number">
> >>>>>>         <xsl:number level="any" from="*[@id]"/>
> >>>>>>      </xsl:template>
> >>>>>>
> >>>>>> and then, where you need that number, you would use e.g.
> >>>>>>
> >>>>>>      <xsl:apply-templates select="." mode="number"/>
> >>>>>>
> >>>>>> Both the template and the select of the apply-templates can of
> >>>>>> course be adapted to more particular needs.
> >>>>>>
> >>>>>> As for being more efficient than using count, that then depends on
> >>>>>> the implementation, but I would think there is some optimization to
> >>>>>> be expected in an XSLT processor for xsl:number.
> >>>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> ...Wendell Piez... ...wendell -at- nist -dot- gov...
> >>>>> ...wendellpiez.com... ...pellucidliterature.org...
> >>>>> ...pausepress.org...
> >>>>> ...github.com/wendellpiez... ...gitlab.coko.foundation/wendell...
> >>
> >>
> --
> Pieter Lamers
> John Benjamins Publishing Company
> Postal Address: P.O. Box 36224, 1020 ME AMSTERDAM, The Netherlands
> Visiting Address: Klaprozenweg 75G, 1033 NN AMSTERDAM, The Netherlands
> Warehouse: Kelvinstraat 11-13, 1446 TK PURMEREND, The Netherlands
> tel: +31 20 630 4747
> web: www.benjamins.com
> 



-- 
...Wendell Piez... ...wendell -at- nist -dot- gov...
...wendellpiez.com... ...pellucidliterature.org... ...pausepress.org...
...github.com/wendellpiez... ...gitlab.coko.foundation/wendell...
