[XSL-LIST Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message]

Building a Trie

Subject: Building a Trie
From: "Michael Mueller-Hillebrand michael.mueller-hillebrand@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 22 Jun 2023 06:29:12 -0000
 Building a Trie
Dear community,

Now I learned about Tries https://en.wikipedia.org/wiki/Trie which is possibly
a helpful method to tackle the left-over problem with my DemoJam contribution
at Markup UK (thanks a lot for the applause!).

Without spending too much time at the details, but at the heart of the
performance problem was this function:

<xsl:function name="my:wordMatches" as="xs:string*">
  <xsl:param name="start" as="xs:string"/>
  <xsl:sequence select="$words[starts-with(., $start)]"/>
</xsl:function>

With $words containing 25,000 entries, the performance was sufficient, but
with 160,000 entries it is just bad.

I already improved the performance by building a map, where the keys are the
first two letters of each word, and the value are all matching words.

<xsl:function name="my:wordMatches" as="xs:string*">
  <xsl:param name="start" as="xs:string"/>
  <xsl:variable name="key" select="substring($start, 1, 2)" as="xs:string"/>
  <xsl:variable name="words" select="map:get($wordsMap, $key)"
as="xs:string*"/>
  <xsl:sequence select="$words[starts-with(., $start)]"/>
</xsl:function>

Building that map takes some noticeable time, but then performance is great.
But only because I reduced the length of $words by factor 200b&300.

Before In look deeper myself, has anyone in the community already created a
Trie-like structure (map of maps?) or can share suggestions how to tackle this
in XSLT/XPath? (Because those are the hammers I plan to use for this
challenge/nail.)

Thanks,
- Michael

Michael MC<ller-Hillebrand
Senior Consultant
Phone +49 951-20859-752
Mobil +49 172-819 34 13
michael.mueller-hillebrand@xxxxxxxxx<mailto:michael.mueller-hillebrand@docufy
.de>
www.docufy.de<https://www.docufy.de/> |
DOCUFY@LinkedIN<https://www.linkedin.com/company/3845358/>
Datenschutz<https://www.docufy.de/datenschutz/>
DOCUFY GmbH | KirschC$ckerstr. 27 | 96052 Bamberg | Deutschland
CEO: Nadine Prill | Amtsgericht Bamberg HRB10571

Current Thread

PURCHASE STYLUS STUDIO ONLINE TODAY!

Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced!

Buy Stylus Studio Now

Download The World's Best XML IDE!

Accelerate XML development with our award-winning XML IDE - Download a free trial today!

Don't miss another message! Subscribe to this list today.
Email
First Name
Last Name
Company
Subscribe in XML format
RSS 2.0
Atom 0.3
Site Map | Privacy Policy | Terms of Use | Trademarks
Free Stylus Studio XML Training:
W3C Member
Stylus Studio® and DataDirect XQuery ™are products from DataDirect Technologies, is a registered trademark of Progress Software Corporation, in the U.S. and other countries. © 2004-2013 All Rights Reserved.