Subject: Re: To determine the distinct elements in a sequence of 46,656 elements takes 5 hours of XSLT processing
From: "Imsieke, Gerrit, le-tex" <gerrit.imsieke@xxxxxxxxx>
Date: Mon, 20 Aug 2012 01:03:12 +0200
You could try the following:

In a first pass, calculate a hash value (string) for each map.

Canonicalize the hash calculation by sorting the singletonMaps by their from and to values. A hash for the given map might be:
A320_SFO__B707_ORD__F16_DFW__F35_MIA__MD90_DEN__S340_LAX


Store the hash-enhanced maps in a variable of type document-node(element(maps)) so that you can pass this document node as the third argument to the key() function.

Define a key that returns a map for its hash value.

Apply distinct-values to the sequence of hashes and, for each distinct hash, return the first item that the key lookup returns for this hash value.
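
For illustration, the two passes described above might look roughly like the following sketch. It is untested against your data and rests on a few assumptions of mine: the namespace URI bound to ct, the key name map-by-hash, and the attribute name hash are all made up, and I assume that "_" does not occur in the from and to values (otherwise pick other separators).

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:ct="http://example.org/ct"
    exclude-result-prefixes="xs ct">

  <!-- Hypothetical key: finds a map by its precomputed hash attribute -->
  <xsl:key name="map-by-hash" match="map" use="@hash"/>

  <xsl:function name="ct:distinct" as="element(map)*">
    <xsl:param name="maps" as="element(map)*"/>

    <!-- Pass 1: copy every map into a temporary document, adding the hash -->
    <xsl:variable name="hashed" as="document-node(element(maps))">
      <xsl:document>
        <maps>
          <xsl:for-each select="$maps">
            <!-- Sorting makes the hash independent of singletonMap order -->
            <xsl:variable name="pairs" as="xs:string*">
              <xsl:for-each select="singletonMap">
                <xsl:sort select="@from"/>
                <xsl:sort select="@to"/>
                <xsl:sequence select="concat(@from, '_', @to)"/>
              </xsl:for-each>
            </xsl:variable>
            <map hash="{string-join($pairs, '__')}">
              <xsl:copy-of select="@*, node()"/>
            </map>
          </xsl:for-each>
        </maps>
      </xsl:document>
    </xsl:variable>

    <!-- Pass 2: one key lookup per distinct hash, keeping the first match -->
    <xsl:for-each select="distinct-values($hashed/maps/map/@hash)">
      <xsl:sequence select="key('map-by-hash', ., $hashed)[1]"/>
    </xsl:for-each>
  </xsl:function>

</xsl:stylesheet>
```

Note that this returns the copies carrying the extra hash attribute, and that the order in which distinct-values() returns the hashes is implementation-dependent.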

Of course, you could also group by the hash values (xsl:for-each-group with group-by) and return each group's context item (= first item). You don't need two passes then.
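
A minimal sketch of the grouping variant, under the same caveats: ct:hash is a helper name I invented, the namespace URI is a placeholder, and "_" is assumed not to occur in the attribute values.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:ct="http://example.org/ct"
    exclude-result-prefixes="xs ct">

  <!-- Hypothetical helper: canonical hash string for one map -->
  <xsl:function name="ct:hash" as="xs:string">
    <xsl:param name="map" as="element(map)"/>
    <xsl:variable name="pairs" as="xs:string*">
      <xsl:for-each select="$map/singletonMap">
        <xsl:sort select="@from"/>
        <xsl:sort select="@to"/>
        <xsl:sequence select="concat(@from, '_', @to)"/>
      </xsl:for-each>
    </xsl:variable>
    <xsl:sequence select="string-join($pairs, '__')"/>
  </xsl:function>

  <xsl:function name="ct:distinct" as="element(map)*">
    <xsl:param name="maps" as="element(map)*"/>
    <!-- Inside xsl:for-each-group, "." is the first item of the current group -->
    <xsl:for-each-group select="$maps" group-by="ct:hash(.)">
      <xsl:sequence select="."/>
    </xsl:for-each-group>
  </xsl:function>

</xsl:stylesheet>
```

Here the original map elements are returned unchanged, in order of first appearance. For the map quoted below, ct:hash should produce the hash string shown above.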

Gerrit



On 2012-08-20 00:43, Costello, Roger L. wrote:
Hi Folks,

I have a sequence of 46,656 elements that I call "maps."

Here is one map:

     <map id="Planes-Enroute-to-Airports">
         <singletonMap from="F16" to="DFW"/>
         <singletonMap from="B707" to="ORD"/>
         <singletonMap from="F35" to="MIA"/>
         <singletonMap from="S340" to="LAX"/>
         <singletonMap from="A320" to="SFO"/>
         <singletonMap from="MD90" to="DEN"/>
     </map>

I wrote a function to return all of the distinct maps.

Unfortunately it takes about 5 hours of XSLT processing.

Perhaps my XSLT program is inefficient.

I am hoping that you can show me a more efficient program or identify where my program is inefficient.

I am using XSLT 2.0.

Here is my function to return all distinct maps:

     <xsl:function name="ct:distinct" as="element(map)*">
         <xsl:param name="maps" />

         <xsl:variable name="new-maps">
             <maps>
                 <xsl:sequence select="$maps" />
             </maps>
         </xsl:variable>
         <xsl:for-each select="$new-maps/maps/map">
             <xsl:if test="not(ct:contained-within(., ./following-sibling::map))"><xsl:sequence select="." /></xsl:if>
         </xsl:for-each>

     </xsl:function>

The following function determines if a map is contained within a sequence of maps; it uses a binary divide-and-conquer approach:

     <xsl:function name="ct:contained-within" as="xs:boolean">
         <xsl:param name="map" as="element(map)"/>
         <xsl:param name="maps" as="element(map)*"/>

         <xsl:variable name="cnt" select="count($maps)" />

         <xsl:choose>
             <xsl:when test="$cnt eq 0"><xsl:value-of select="false()" /></xsl:when>
             <xsl:when test="ct:equal($map, $maps[1])"><xsl:value-of select="true()" /></xsl:when>
             <xsl:when test="$cnt eq 1"><xsl:value-of select="false()" /></xsl:when>
             <xsl:otherwise>
                 <xsl:variable name="half" select="$cnt idiv 2" />
                 <xsl:choose>
                     <xsl:when test="ct:contained-within($map, $maps[position() = (2 to $half)])"><xsl:value-of select="true()" /></xsl:when>
                     <xsl:otherwise><xsl:value-of select="ct:contained-within($map, $maps[position() = (($half + 1) to last())])" /></xsl:otherwise>
                 </xsl:choose>

             </xsl:otherwise>
         </xsl:choose>

     </xsl:function>

Two maps are equal iff for each singletonMap in map1 there is a singletonMap in map2 with the same value for @to and @from:

     <xsl:function name="ct:equal" as="xs:boolean">
         <xsl:param name="map1" as="element(map)"/>
         <xsl:param name="map2" as="element(map)"/>

         <xsl:choose>
             <xsl:when test="count($map1/*) ne count($map2/*)"><xsl:value-of select="false()" /></xsl:when>
             <xsl:otherwise>
                 <xsl:variable name="result">
                     <xsl:for-each select="$map1/singletonMap">
                         <xsl:variable name="here" select="." />
                         <xsl:if test="$map2/singletonMap[@from eq $here/@from and @to ne $here/@to]">false</xsl:if>
                     </xsl:for-each>
                 </xsl:variable>

                 <xsl:choose>
                     <xsl:when test="contains($result, 'false')"><xsl:value-of select="false()" /></xsl:when>
                     <xsl:otherwise><xsl:value-of select="true()" /></xsl:otherwise>
                 </xsl:choose>
             </xsl:otherwise>
         </xsl:choose>

     </xsl:function>

/Roger


--
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
gerrit.imsieke@xxxxxxxxx, http://www.le-tex.de

Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930

Geschäftsführer: Gerrit Imsieke, Svea Jelonek,
Thomas Schmidt, Dr. Reinhard Vöckler