Re: Clean data using XSLT

Subject: Re: Clean data using XSLT
From: Jeni Tennison <jeni@xxxxxxxxxxxxxxxx>
Date: Mon, 07 Aug 2000 23:28:35 +0100
Michal,

>I use this XSLT on it:
><xsl:for-each select='//Property[not(.=preceding::Property)]'>
>     <xsl:sort select="@Id" />
>
>     <xsl:variable name="PropertyAddress" select="text()" />
>
>     <BR />
>     <xsl:value-of select="@Id" />) <xsl:value-of select="." />
>     (appears <xsl:value-of select="count(//Property[.=$PropertyAddress])" />
>times)
>
></xsl:for-each>
[snip]
>I would like to have the data also sorted by the number of times this
>version of the address appears in the data, so my output would be:

You've managed to describe what you want to do, which is always the
majority of the work :)  Let me show you my rudimentary translation:

"sorted"                      -> xsl:sort
"the number of times"         -> count()
"this version of the address" -> current()/text() [or just current()]
"the data"                    -> //Property

The data are the Property elements: //Property

You are only interested in the ones that have the same content as the
item you're currently sorting.  When you're only interested in a subset of
nodes, this means a predicate.  The predicate needs to test the content of
each Property element against the content of the node you're sorting:

  //Property[. = current()]

Then you want to count the number of nodes in this set:

  count(//Property[. = current()])
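
To make the difference between '.' and current() concrete, here is a
minimal sketch.  The template and the <li> wrapper are only my assumptions
for illustration; they are not part of your stylesheet:

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed context: this template runs once per Property element. -->
  <xsl:template match="Property">
    <li>
      <xsl:value-of select="." />
      <xsl:text>: </xsl:text>
      <!-- Inside the predicate, '.' is each Property being tested in turn,
           while current() is still the Property this template matched. -->
      <xsl:value-of select="count(//Property[. = current()])" />
    </li>
  </xsl:template>

</xsl:stylesheet>
```

Writing [. = .] instead would compare each Property with itself, so the
predicate would always be true and every Property would be counted.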

So, if you add another sort using this as the 'select' expression, making
sure to sort in descending order so that the ones that appear most
frequently appear first in the list, and with data-type="number" so that
the counts are compared as numbers rather than as text (the default), you
can do it:

<xsl:for-each select='//Property[not(.=preceding::Property)]'>
     <xsl:sort select="@Id" />
     <xsl:sort select="count(//Property[. = current()])"
               data-type="number" order="descending" />

     <xsl:variable name="PropertyAddress" select="text()" />

     <BR />
     <xsl:value-of select="@Id" />) <xsl:value-of select="." />
     (appears <xsl:value-of select="count(//Property[.=$PropertyAddress])" />
times)

</xsl:for-each>

A slight change that I'd make for performance is to give the direct path to
the Property elements rather than using '//'.  It can be quite laborious
for processors to search the entire tree, and as you don't have nested
Property elements all over the place, there's no need to do so:

<xsl:for-each select='/RootNode/Property[not(.=preceding::Property)]'>
     <xsl:sort select="@Id" />
     <xsl:sort select="count(/RootNode/Property[. = current()])"
               data-type="number" order="descending" />

     <xsl:variable name="PropertyAddress" select="text()" />

     <BR />
     <xsl:value-of select="@Id" />) <xsl:value-of select="." />
     (appears <xsl:value-of
         select="count(/RootNode/Property[.=$PropertyAddress])" />
times)

</xsl:for-each>

The other thing is that I'd think about using keys to index the Property
elements both by @Id and by their content, so that it's easy to (a)
identify the first Property with a particular @Id (b) identify all the
other Properties with that @Id and (c) count how many Properties there are
with that same address:

<xsl:key name="property-ids" match="Property" use="@Id" />
<xsl:key name="property-address" match="Property" use="." />

...
<xsl:for-each select="/RootNode/Property[generate-id() =
                      generate-id(key('property-ids', @Id)[1])]">
     <xsl:sort select="@Id" />
     <xsl:sort select="count(key('property-address', .))"
               data-type="number" order="descending" />

     <xsl:variable name="PropertyAddress" select="text()" />

     <BR />
     <xsl:value-of select="@Id" />) <xsl:value-of select="." />
     (appears <xsl:value-of select="count(key('property-address', .))" />
times)

</xsl:for-each>
...
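
In case it helps to see the pieces in place, here is a rough sketch of a
complete stylesheet built around the two keys.  The RootNode and Property
names come from the paths above; the template matching "/" and the html
output method are my assumptions, so adjust them to fit your stylesheet:

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: the result is HTML, as the <BR /> elements suggest. -->
  <xsl:output method="html" />

  <!-- Index every Property by its Id attribute and by its text content. -->
  <xsl:key name="property-ids" match="Property" use="@Id" />
  <xsl:key name="property-address" match="Property" use="." />

  <xsl:template match="/">
    <!-- Keep only the first Property for each distinct @Id. -->
    <xsl:for-each select="/RootNode/Property[generate-id() =
                            generate-id(key('property-ids', @Id)[1])]">
      <xsl:sort select="@Id" />
      <!-- Compare the counts as numbers, most frequent address first. -->
      <xsl:sort select="count(key('property-address', .))"
                data-type="number" order="descending" />
      <BR />
      <xsl:value-of select="@Id" />) <xsl:value-of select="." />
      (appears <xsl:value-of select="count(key('property-address', .))" />
      times)
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

The keys are declared at the top level of the stylesheet, so the
processor can build each index once and reuse it for both the sort and
the count, rather than walking the tree for every Property.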

I hope this helps,

Jeni

Dr Jeni Tennison
Epistemics Ltd * Strelley Hall * Nottingham * NG8 6PE
tel: 0115 906 1301 * fax: 0115 906 1304 * email: jeni.tennison@xxxxxxxxxxxxxxxx


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

