
Re: Alphabetical index: unstreamable?

Subject: Re: Alphabetical index: unstreamable?
From: "Michael Müller-Hillebrand mmh@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 2 Jun 2014 18:43:13 -0000
Abel,

Thanks a lot for the additional hints. They will help us avoid a lot of
pitfalls. Just recently I studied the transcript of your talk in Prague (thanks
to Roger C.) and learned a lot about the streaming restrictions.

Let me put it this way: this thread convinced one of our Java developers to
stop implementing a Java solution for this XSLT problem. It looks like we
would rather wait for the final XSLT 3.0 spec.

Thanks,

- Michael

PS: Unfortunately, one cannot visit every cool XML conference.

On 02.06.2014 at 16:47, Abel Braaksma wrote:

>
> On 28-5-2014 19:50, Michael Müller-Hillebrand mmh@xxxxxxxxx wrote:
>> Hi Dimitre,
>>
>> Do I understand correctly this could be as "simple" as defining an
>> accumulator that incrementally builds up a map? If the source contains
>> <indexterm> elements I could maybe do something similar to
>>
>> <xsl:accumulator name="indexterms" as="map(xs:string, element(indexterm))"
>>    initial-value="map{}">
>>    <xsl:accumulator-rule match="indexterm"
>>      new-value=" map:put($value, generate-id(), .) "/>
>> </xsl:accumulator>
>>
>> and at the end process the content of the accumulator?
>
> Yes, that is essentially how it is supposed to be done. However, there
> are a few caveats with the code snippet above:
>
> - accumulators must be motionless; they cannot consume the current node
> - you cannot store references to nodes; here you use ".", which is not
> allowed
> - childless nodes, such as text(), can be consumed, which comes in handy
> here
> - map:put was dropped, but it seems to be re-emerging, see Public XSLT Spec
> Bug 24726 (https://www.w3.org/Bugs/Public/show_bug.cgi?id=24726)
>
> To create an accumulator for indexterm elements, we need to reverse the
> match pattern, so that the focus of the accumulator is on a non-element
> leaf-node (a childless node). For simplicity, let's assume a DocBook
> <indexterm> like the following:
>
> <indexterm>
>    <primary>prim</primary>
>    <secondary>beginning</secondary>
> </indexterm>
>
> Then your accumulator could look like this:
>
> <xsl:accumulator name="indexterms"
>    as="map(xs:string, xs:string+)"
>    initial-value="map{}">
>    <xsl:accumulator-rule
>        match="text()[parent::primary | parent::secondary][ancestor::indexterm]"
>        new-value="map:put(
>            $value,
>            generate-id(ancestor::indexterm),
>            ($value(generate-id(ancestor::indexterm)), string(.)))" />
>
> </xsl:accumulator>
>
> This matches on the text-node, and consuming the text-node is allowed
> (it will always be childless). The fn:string(.) is still required (or
> use fn:data or fn:copy-of), because even though it is a childless node,
> you cannot store its reference in a map.
>
> The accumulator above will create a sequence of terms mapped to the
> indexterm-element, where the first term will be the <primary> element's
> content and the second in the sequence will be the <secondary>, if any.
>
> The expression inside the new-value attribute can quickly become
> unmanageable, but you can write a stylesheet function to write it
> declaratively.
>
> Note that you must be careful with fn:generate-id in a streaming
> scenario. With streaming it is likely you will have places where you use
> fn:copy-of or fn:snapshot. The IDs of these nodes will be different
> from the ones on the streamed nodes of the input stream.
>
> Note also that this won't help you if you want to place the resulting
> index prior to the nodes to be processed, i.e. a TOC at the beginning of
> a document cannot be created this way.
>
> If you plan to attend the XML London 2014 conference this weekend, my
> talk will be about Streaming Design Patterns, common programming
> scenarios encountered in XSLT 2.0 and how to write them in a streamable
> way. From easy (such as matching patterns that depend on the child
> axis), to intermediate (such as working out following-sibling scenarios)
> to advanced (such as a streamable way to do sorting in a maximum of two
> passes).
>
> Cheers,
> Abel
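
The suggestion in the quoted message to move the map-building expression into
a stylesheet function could look roughly like the sketch below. It is only an
illustration under several assumptions: it uses the syntax of the final XSLT
3.0 Recommendation, where the accumulator rule takes a select attribute
instead of the new-value attribute of the 2014 draft; the function name
f:add-term and its namespace are made up for this example; and it has not been
checked against a streaming processor.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    xmlns:f="urn:example:index-functions"
    exclude-result-prefixes="#all">

  <!-- Hypothetical helper: appends one term to the sequence of strings
       stored under the given indexterm id and returns the updated map.
       It only receives atomic values, never node references. -->
  <xsl:function name="f:add-term" as="map(xs:string, xs:string+)">
    <xsl:param name="terms" as="map(xs:string, xs:string+)"/>
    <xsl:param name="id" as="xs:string"/>
    <xsl:param name="term" as="xs:string"/>
    <!-- $terms($id) is the empty sequence when the key is not present yet -->
    <xsl:sequence select="map:put($terms, $id, ($terms($id), $term))"/>
  </xsl:function>

  <!-- The rule matches the childless text nodes and passes only strings
       to the function; $value is the accumulator value so far. -->
  <xsl:accumulator name="indexterms"
      as="map(xs:string, xs:string+)"
      initial-value="map{}"
      streamable="yes">
    <xsl:accumulator-rule
        match="indexterm/primary/text() | indexterm/secondary/text()"
        select="f:add-term($value,
                           generate-id(ancestor::indexterm),
                           string(.))"/>
  </xsl:accumulator>

</xsl:stylesheet>
```

The rule itself now only states what is added and under which key, while the
map handling lives in one place. In the final specification the accumulated
map is read with accumulator-before('indexterms') or
accumulator-after('indexterms') rather than through a variable reference.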

