[XSL-LIST Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message]

Re: Building a Trie

Subject: Re: Building a Trie
From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 22 Jun 2023 08:12:33 -0000
Re:  Building a Trie
Incidentally, in SaxonJ the default data structure for a map is in fact a
HashTrie, implemented using Michael Froh's ImmutableMap library. But a
HashTrie gives you sorted access by hash code, not by the actual key value. An
xsl:key index defined using saxon:range-key="yes" is implemented as a Java
TreeMap, which gives you sorted access using the key value. A TreeMap is
apparently implemented as a red-black tree, not as a Trie, but we don't really
need to know that.

Michael Kay

> On 22 Jun 2023, at 08:44, Michael Kay mike@xxxxxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> Saxon (-PE and higher) has extensions that might help.
> First, on xsl:key you can declare saxon:range-key="yes" (which for some
reason is not mentioned at
but only at
; the documentation also fails to mention that it doesn't work on SaxonCS.)
> You can then view any part of the key as a sorted map using saxon:key-map(),
and within that sorted map, keys() delivers the keys in sorted order.
> It's a bit convoluted, but to find words starting with "PQR" you could
define a key-map with a range from "PQR" to "PQR&#x10FFFF;"; and then use
keys() on that map. (Note that key-map doesn't build a new index, it just
creates a view of a subset of the existing index).
> Having got the data structure, I'm sure I could find a better way of
exploiting it....
> Meanwhile I'm going to add an issue on the qt4 list to add sorted maps.
> Michael Kay
> Saxonica
>> On 22 Jun 2023, at 07:29, Michael Mueller-Hillebrand
<mailto:xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>> wrote:
>> Dear community,
>> Now I learned about Tries https://en.wikipedia.org/wiki/Trie
<https://en.wikipedia.org/wiki/Trie> which is possibly a helpful method to
tackle the left-over problem with my DemoJam contribution at Markup UK (thanks
a lot for the applause!).
>> Without spending too much time at the details, but at the heart of the
performance problem was this function:
>> <xsl:function name="my:wordMatches" as="xs:string*">
>>   <xsl:param name="start" as="xs:string"/>
>>   <xsl:sequence select="$words[starts-with(., $start)]"/>
>> </xsl:function>
>> With $words containing 25,000 entries, the performance was sufficient, but
with 160,000 entries it is just bad.
>> I already improved the performance by building a map, where the keys are
the first two letters of each word, and the value are all matching words.
>> <xsl:function name="my:wordMatches" as="xs:string*">
>>   <xsl:param name="start" as="xs:string"/>
>>   <xsl:variable name="key" select="substring($start, 1, 2)"
>>   <xsl:variable name="words" select="map:get($wordsMap, $key)"
>>   <xsl:sequence select="$words[starts-with(., $start)]"/>
>> </xsl:function>
>> Building that map takes some noticeable time, but then performance is
great. But only because I reduced the length of $words by factor 200b&300.
>> Before In look deeper myself, has anyone in the community already created a
Trie-like structure (map of maps?) or can share suggestions how to tackle this
in XSLT/XPath? (Because those are the hammers I plan to use for this
>> Thanks,
>> - Michael
>> Michael MC<ller-Hillebrand
>> Senior Consultant
>> Phone +49 951-20859-752
>> Mobil +49 172-819 34 13
>> michael.mueller-hillebrand@xxxxxxxxx
>> www.docufy.de <https://www.docufy.de/> | DOCUFY@LinkedIN
>> Datenschutz <https://www.docufy.de/datenschutz/>
>> DOCUFY GmbH | KirschC$ckerstr. 27 | 96052 Bamberg | Deutschland
>> CEO: Nadine Prill | Amtsgericht Bamberg HRB10571
>> XSL-List info and archive <http://www.mulberrytech.com/xsl/xsl-list>
>> EasyUnsubscribe <http://lists.mulberrytech.com/unsub/xsl-list/293509> (by
email <applewebdata://7ECE1E74-C159-42BA-9130-783318C4DB87>)
> XSL-List info and archive <http://www.mulberrytech.com/xsl/xsl-list>
> EasyUnsubscribe <http://lists.mulberrytech.com/unsub/xsl-list/3500899> (by
email <>)

Current Thread


Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced!

Buy Stylus Studio Now

Download The World's Best XML IDE!

Accelerate XML development with our award-winning XML IDE - Download a free trial today!

Don't miss another message! Subscribe to this list today.
First Name
Last Name
Subscribe in XML format
RSS 2.0
Atom 0.3
Site Map | Privacy Policy | Terms of Use | Trademarks
Free Stylus Studio XML Training:
W3C Member
Stylus Studio® and DataDirect XQuery ™are products from DataDirect Technologies, is a registered trademark of Progress Software Corporation, in the U.S. and other countries. © 2004-2013 All Rights Reserved.