
RE: Finding unique nodes in a non-sibling nodeset

Subject: RE: Finding unique nodes in a non-sibling nodeset
From: "Michael Kay" <michael.h.kay@xxxxxxxxxxxx>
Date: Sun, 30 Jun 2002 20:14:45 +0100
> In a code generation transform that I am working on, I 
> frequently encounter situations where I need to eliminate 
> duplicate expressions or event calls. The nodes with the 
> commonality to be detected are often scattered around 
> different parts of a large (preprocessed) reference document 
> that is loaded with a document call.
> 
> Previously, I had eliminated duplicates with something of the 
> form  $list[not(@key1=preceding-sibling::*/@key1)]
> or
>  $list[not(@key1=preceding::*/@key1)]
> ... If I wanted to look back through the whole document.
> 
> In this situation however, the nodes to be duplicate-trimmed are
> 
> [A] Selected out of the reference document in very specific contextual
>     ways (e.g. deep inside xsl:template / xsl:for-each usages)
> [B] Not all sibling nodes
> [C] The preceding axis can't be used since it looks at the whole
>     preceding area of the document, not just my carefully selected nodes.
> [D] The definition of duplication requires use of multiple node
>     attributes, i.e. needs a composite key.
> 
> Even if [D] were not true, the "preceding-sibling" axis 
> approach would not work because of [B] and the "preceding" 
> axis approach would not work because of [C].

Muenchian grouping should be able to cope with this, provided (a) all
the nodes are in the same document, and (b) you can code the rules for
"carefully selecting" the nodes in a match pattern. You can handle
composite keys using concatenation.
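
As a rough illustration of the above, here is a minimal XSLT 1.0 sketch of Muenchian grouping on a composite key. The element names (section, item), the @use attribute and the key name are hypothetical stand-ins for whatever your "carefully selecting" rules really are; only @key1 and @key2 come from the original question. It also assumes '/' never occurs inside either attribute value, otherwise two different pairs could concatenate to the same string.

```xslt
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical selection rule, written as a match pattern:
       only item elements directly under section[@use='yes'].
       The key value is the composite of the two attributes. -->
  <xsl:key name="by-keys"
           match="section[@use='yes']/item"
           use="concat(@key1, '/', @key2)"/>

  <xsl:template match="/">
    <distinct>
      <!-- Keep a node only if it is the first node (in document order)
           that the key returns for its own composite value. -->
      <xsl:for-each select="//section[@use='yes']/item
          [generate-id() =
           generate-id(key('by-keys', concat(@key1, '/', @key2))[1])]">
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </distinct>
  </xsl:template>

</xsl:stylesheet>
```

Note that key() only searches the document containing the context node. If the nodes come from a document() call, start the select expression from that document so that the predicate is evaluated with a context node inside it.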

Where these conditions aren't true, the usual approach is to build a
temporary tree containing copies of the selected nodes. You can then use
Muenchian grouping on this tree, accessing it using the xx:node-set()
extension function.
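
A sketch of the temporary-tree variant, for illustration only. It assumes your processor supports the EXSLT exsl:node-set() function (substitute your vendor's equivalent otherwise); 'reference.xml', the section/item names and the key name are hypothetical.

```xslt
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">

  <!-- The key is applied to the copies, which are plain item elements
       at the top level of the temporary tree. -->
  <xsl:key name="copy-by-keys" match="item"
           use="concat(@key1, '/', @key2)"/>

  <xsl:template match="/">
    <!-- Hypothetical selection; in practice $list is whatever
         node-set your contextual logic has produced. -->
    <xsl:variable name="list"
        select="document('reference.xml')//section[@use='yes']/item"/>

    <!-- Result tree fragment holding copies of the selected nodes. -->
    <xsl:variable name="copies">
      <xsl:copy-of select="$list"/>
    </xsl:variable>

    <distinct>
      <!-- Convert the fragment to a node-set, then apply Muenchian
           grouping within that temporary document. -->
      <xsl:for-each select="exsl:node-set($copies)/item
          [generate-id() =
           generate-id(key('copy-by-keys', concat(@key1, '/', @key2))[1])]">
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </distinct>
  </xsl:template>

</xsl:stylesheet>
```

Because the results are copies, they have lost their original ancestors and siblings; anything you need from the original context has to be copied into the temporary tree as well.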

> 
> I eventually hit on a way to solve this (since I use Saxon) 
> using saxon:tokenize. But I always wondered if there was a 
> non-extension way to do it.
> 
> What I did was build an aggregate string with delimiters from 
> the nodes in the set in question (in a variable called 
> "$list"), like so ...
> 
>   <xsl:variable name="aggregate">
>     <xsl:for-each select="$list">
>       <xsl:value-of select="concat(@key1,'/',@key2)" />
>       <xsl:if test="not(position()=last())"><xsl:text>#</xsl:text></xsl:if>
>     </xsl:for-each>
>   </xsl:variable>
> 
> Then use tokenize to get a node set ...
> 
>  <xsl:variable name="list4" select="saxon:tokenize($aggregate,'#')"/>
> 
> And eliminate the duplicates the standard (?) way with
> 
>  <xsl:variable name="list4NoDups" select="$list4[not(.=preceding-sibling::*)]"/>

Innovative, but as you say, if you're going to use extensions,
saxon:distinct() does the job more directly.

> There are features in Saxon 7.1 that we are very interested 
> in, so I needed to try to find a different technique.
> 
XPath 2.0 offers a distinct-values() function, but it's not yet
available in Saxon. What you can use, however, is <xsl:for-each-group>.
I think this should solve your problem fairly directly.

<xsl:for-each-group select="$list" group-by="concat(@key1, '/', @key2)">
  ...

This will iterate once for each distinct value of the group-by key, with
the context node being the first node in $list that has that key value.
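
For illustration, a fuller sketch of the same idea, written against xsl:for-each-group as it appears in the final XSLT 2.0 specification; an early implementation of the working draft may differ in details. The section/item names and the entry output element are hypothetical, and as before '/' is assumed not to occur in the attribute values.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <!-- Hypothetical selection; $list can be any sequence of nodes,
         siblings or not. -->
    <xsl:variable name="list" select="//section[@use='yes']/item"/>

    <distinct>
      <xsl:for-each-group select="$list"
                          group-by="concat(@key1, '/', @key2)">
        <!-- One iteration per distinct composite key. The context node
             is the first node in $list with that key;
             current-group() returns all nodes sharing it. -->
        <entry key="{current-grouping-key()}"
               count="{count(current-group())}">
          <xsl:copy-of select="."/>
        </entry>
      </xsl:for-each-group>
    </distinct>
  </xsl:template>

</xsl:stylesheet>
```

No key declaration or temporary tree is needed here, since the grouping works directly on the selected nodes regardless of where they sit in the document.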

Michael Kay
Software AG
home: Michael.H.Kay@xxxxxxxxxxxx
work: Michael.Kay@xxxxxxxxxxxxxx 


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

