Subject: Re: removing duplicate elements based on two or more childs
From: Manuel Souto Pico <manuel.souto@xxxxxxxxxxxxxx>
Date: Tue, 25 Aug 2009 16:55:57 +0200
Hi again,
I have also tried:
<xsl:template match="record">
  <xsl:if test="//term[not(. = preceding::term)]">
    <note>This is a new record</note>
    <xsl:apply-templates select="child::lang" />
  </xsl:if>
etc.
but to no avail.
Thanks, Manuel
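
A possible reason the attempt above has no effect: `//term` starts at the
document root, so the test does not depend on the record being processed.
It is true as soon as any term in the document has no equal preceding term,
and the first term always qualifies. A minimal sketch of a test that is
relative to the current record follows. It assumes XSLT 1.0, exactly the two
languages of the sample (`fra` and `eng`), one term per language, and records
that share one parent. The identity template is added for illustration only.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template (assumption): copy everything not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="record">
    <!-- Keep the record only if no earlier sibling record has both the
         same French term and the same English term. Inside the predicate
         the context node is the earlier record; current() is the record
         being processed. -->
    <xsl:if test="not(preceding-sibling::record[
                    lang[@id='fra']/term = current()/lang[@id='fra']/term and
                    lang[@id='eng']/term = current()/lang[@id='eng']/term])">
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

On the three sample records this keeps records 1 and 2 and drops record 3.
Each record is compared with all earlier ones, so the cost grows
quadratically with the number of records.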
Manuel Souto Pico wrote:
Hi,
I tried to get this done some time ago, but for lack of time and
because of the difficulty of the task, I dropped it and did it by
other, less elegant means. Now I need it again and I think it deserves
another chance. With a bit of help I'm sure I'll manage to get it done,
and it should be useful both for me and, I guess, for a lot of people.
A simplified explanation. In an XML file I have records, which contain
languages, which contain terms. The path down to any term would be
/doc/body/text/record/lang/term. For example:
<record id="1">
  <lang id="fra">
    <term>banque</term>
  </lang>
  <lang id="eng">
    <term>bank</term>
  </lang>
</record>
<record id="2">
  <lang id="fra">
    <term>banque</term>
  </lang>
  <lang id="eng">
    <term>bench</term>
  </lang>
</record>
<record id="3">
  <lang id="fra">
    <term>banque</term>
  </lang>
  <lang id="eng">
    <term>bank</term>
  </lang>
</record>
As you can see, the French term is the same in the three records. If
we removed duplicates based on the French term alone, we would end up
with only one record. However, what I need is to remove duplicates
taking into account the terms in all languages, here only two (French
and English), but it should be extensible to n languages.
The expected outcome would contain only records 1 and 2 (that is, two
distinct concepts, the bank and the bench), because record 3 repeats
both terms of record 1. Notice that it is the whole parent element
record which must not be generated if the terms that it contains are
duplicates, not just the children, so this is not the ideal outcome:
<record id="1">
  <lang id="fra">
    <term>banque</term>
  </lang>
  <lang id="eng">
    <term>bank</term>
  </lang>
</record>
<record id="2">
  <lang id="fra">
    <term>banque</term>
  </lang>
  <lang id="eng">
    <term>bench</term>
  </lang>
</record>
<record id="3">
  <lang id="fra"/>
  <lang id="eng"/>
</record>
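
The requirement described above (drop the whole record, for any number of
languages) could be sketched as a grouping of records by a signature built
from all their terms. The sketch assumes an XSLT 2.0 processor, that the
languages appear in the same order in every record, and that the separator
characters `=`, `,` and `|` do not occur in any id or term. The parent
element name `text` is taken from the path /doc/body/text/record. Children
of `text` other than records are not copied in this sketch.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template (assumption): copy everything not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- The element named "text" is the parent of the records. -->
  <xsl:template match="text">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <!-- Signature of a record: one "id=term" entry per lang, joined
           with '|', e.g. "fra=banque|eng=bank". Works for n languages. -->
      <xsl:for-each-group select="record"
          group-by="string-join(
                      for $l in lang
                      return concat($l/@id, '=', string-join($l/term, ',')),
                      '|')">
        <!-- Emit only the first record of each group, as a whole. -->
        <xsl:copy-of select="current-group()[1]"/>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

With the sample data, records 1 and 3 share the signature
"fra=banque|eng=bank", so only record 1 is copied from that group, and
record 2 is copied from its own group.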
I've tried using <xsl:apply-templates select="child::seg[not (. =
preceding-sibling::seg)]" /> but, say,
/record[@id=1]/lang[@id="fra"]/term and
/record[@id=3]/lang[@id="fra"]/term are not siblings because they
don't have the same parent. Perhaps this wasn't the best way to go
anyway.
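
One way around the sibling problem described above is to avoid axes
altogether and index the records with a key, since a key covers the whole
document regardless of parent. The following is a hedged sketch of Muenchian
grouping in XSLT 1.0. It assumes the two languages of the sample (`fra` and
`eng`), one term per language, and that `|` does not occur in any term. The
key name `record-by-terms` is made up for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Index every record by its French and English terms. -->
  <xsl:key name="record-by-terms" match="record"
           use="concat(lang[@id='fra']/term, '|', lang[@id='eng']/term)"/>

  <!-- Identity template (assumption): copy everything not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Empty template: suppress any record that is not the first record,
       in document order, stored under its own key value. -->
  <xsl:template match="record[generate-id() != generate-id(
      key('record-by-terms',
          concat(lang[@id='fra']/term, '|', lang[@id='eng']/term))[1])]"/>

</xsl:stylesheet>
```

For the sample input, records 1 and 3 share the key value "banque|bank",
so record 3 is suppressed and records 1 and 2 are copied unchanged. Adding
a language means extending both concat() calls, which is the main
limitation of this XSLT 1.0 variant.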
Any suggestion would be more than welcome. Thanks a lot.
Have a nice evening,