Subject: RE: Removing duplicates where duplicates are determined by the concatenation of two elements
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Tue, 18 Dec 2007 15:28:12 -0000
> My question is:
> Can anyone tell me what is wrong with my XSLT (see below)?

A great deal. First of all, the obvious way to tackle this is using
<xsl:for-each-group>. Apart from that:


> xmlns:fn="http://www.w3.org/2005/xpath-functions">

You never need to declare this namespace. It's the default namespace for
functions, so when you call standard functions in XSLT you don't need to use
any prefix. I know Altova creates this namespace declaration automatically,
but you should get rid of it: it's unwanted noise.
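
A minimal sketch of a stylesheet without the declaration, assuming an XSLT
2.0 processor. The upper-case() call and the path to surname are there only
to show a standard function being called with no prefix; they are not taken
from the original stylesheet:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- No xmlns:fn declaration: standard functions are called unprefixed -->
  <xsl:template match="/">
    <xsl:value-of select="upper-case(/persons/person[1]/surname)"/>
  </xsl:template>

</xsl:stylesheet>
```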

> <xsl:template match="/ | node() | @* | comment() | processing-instruction()">

You don't need comment() or processing-instruction(); those nodes are
already matched by virtue of node().

In fact you don't need this template rule, since you are never invoking it.

>   <xsl:variable name="persons">
>      <xsl:for-each select="//person">
>          <xsl:copy-of select="."/>

Why are you going to such efforts to copy the data when you could work with
the original? You're also using //person when you could write
/persons/person which would almost certainly be more efficient. You just
want

<xsl:variable name="persons" select="/persons/person"/>

(though the variable in this case doesn't really add much value).

>  <xsl:for-each select="$persons/person">

$persons is a sequence of person elements. Person elements don't have
children called person, so the select will select nothing. Change it to
select="$persons".

>     <xsl:variable name="pos" select="position( )"/>
>     <xsl:if test="$pos = 1 or concat(./first_name,./surname)
>       !=concat(./first_name[$pos - 1],./surname[$pos - 1])">

Apart from the fact that you are hand-coding <xsl:for-each-group>, and
assuming the correction above:

(a) A person only has one surname, so ./surname[$pos - 1] selects nothing.
What you should be comparing with is $persons[$pos - 1]/surname

(b) Do you really want to compare the concatenation, that is, to treat ANN
EWING as a duplicate of ANNE WING? Why not compare the surname and first name
independently?

(c) The $pos=1 test is redundant. If $pos is 1, $persons[$pos - 1] is empty,
and an "=" comparison against an empty sequence is always false, so the test
for equality of names will automatically be false.
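
Putting (a) to (c) together, the hand-coded loop might end up looking
something like this. It is only a sketch, assuming the input is
/persons/person with one first_name and one surname child each, and that
duplicates are adjacent (as the original logic already assumes). The $prev
variable and the <persons> wrapper in the output are names introduced here
for illustration:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <xsl:variable name="persons" select="/persons/person"/>
    <persons>
      <xsl:for-each select="$persons">
        <xsl:variable name="pos" select="position()"/>
        <!-- $prev is hypothetical; it is empty when $pos is 1 -->
        <xsl:variable name="prev" select="$persons[$pos - 1]"/>
        <!-- Compare the two names independently, not their concatenation.
             With an empty $prev both "=" tests are false, so the first
             person is always copied without a separate $pos = 1 test. -->
        <xsl:if test="not(first_name = $prev/first_name
                          and surname = $prev/surname)">
          <xsl:copy-of select="."/>
        </xsl:if>
      </xsl:for-each>
    </persons>
  </xsl:template>

</xsl:stylesheet>
```

Note the condition is written as not(a = b) rather than a != b: with "!="
the comparison against the empty $prev would also be false, and the first
person would be dropped.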
>                                                               
>                 <xsl:copy-of select="."/>
>                                                               
Even without xsl:for-each-group, your logic could be simplified to

<xsl:template match="/">
  <xsl:copy-of select="/persons/person[not(
      first_name = preceding-sibling::person[1]/first_name and
      surname = preceding-sibling::person[1]/surname)]"/>
</xsl:template>

That's essentially the whole stylesheet...
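
For comparison, here is a rough sketch of the xsl:for-each-group approach
mentioned at the top. It assumes XSLT 2.0, the same /persons/person input
with one first_name and one surname per person, and a '|' separator that
never occurs in a name. The separator and the <persons> output wrapper are
assumptions made for the example, not part of the original stylesheet:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <persons>
      <!-- Group on both names. The '|' separator (an assumption) keeps
           ANN/EWING and ANNE/WING in different groups. -->
      <xsl:for-each-group select="/persons/person"
                          group-by="concat(first_name, '|', surname)">
        <!-- Copy only the first person in each group -->
        <xsl:copy-of select="current-group()[1]"/>
      </xsl:for-each-group>
    </persons>
  </xsl:template>

</xsl:stylesheet>
```

Unlike the preceding-sibling comparison, group-by does not depend on the
duplicates being adjacent; if they are known to be adjacent, group-adjacent
with the same key would be the closer equivalent.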

Michael Kay
http://www.saxonica.com/
