
Re: Sorting chemical formulae in XSLT 2.0

Subject: Re: Sorting chemical formulae in XSLT 2.0
From: Emmanuel Bégué <eb@xxxxxxxxxx>
Date: Thu, 25 Nov 2010 10:17:58 +0100
Hello,

I think regular expressions would help. It has been a while since I
had to deal with chemical formulae, so I am not sure I completely
understand your requirements, but the following stylesheet gives the
expected result for your sample:

<xsl:template match="list">
	<xsl:for-each select="*">
		<xsl:sort select="ms:molSort2(.)"/>
		<xsl:copy-of select="."/>
	</xsl:for-each>
</xsl:template>

<!-- the ms prefix must be bound to a namespace on xsl:stylesheet -->
<xsl:function name="ms:molSort2">
	<xsl:param name="node"/>
	<!-- take out unwanted characters (brackets, commas, spaces) and
	     only keep letters and numbers -->
	<xsl:variable name="filter">
		<xsl:analyze-string select="string($node)" regex="[A-Za-z0-9]+">
			<xsl:matching-substring>
				<xsl:value-of select="."/>
			</xsl:matching-substring>
		</xsl:analyze-string>
	</xsl:variable>
	<!-- does two things: pads numbers to three digits, and transforms
	     letters to their codepoints, so that at the end we only have
	     a long string of numbers -->
	<xsl:variable name="sortString">
		<xsl:analyze-string select="$filter" regex="\d+">
			<xsl:matching-substring><!-- this is a number -->
				<xsl:value-of select="format-number(number(.), '000')"/>
			</xsl:matching-substring>
			<xsl:non-matching-substring><!-- (at this point) this is a run of letters -->
				<!-- xsl:value-of joins the codepoints with single spaces -->
				<xsl:value-of select="string-to-codepoints(.)"/>
			</xsl:non-matching-substring>
		</xsl:analyze-string>
	</xsl:variable>
	<xsl:value-of select="$sortString"/>
</xsl:function>
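
A possible variant of the same idea, offered as a sketch rather than a
tested drop-in: it keeps the letters as they are, pads each number to a
fixed width, and prefixes it with '~' so that, under the codepoint
collation, a number sorts after any ASCII letter in the same position.
The function name ms:molKey, the namespace URI urn:example:molsort and
the six-digit width are assumptions made for illustration. Unlike the
filter step above, it does not concatenate the pieces before looking
for numbers, so the 4 and the 7 in "SO4,7H2O" stay separate instead of
merging into 47.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:ms="urn:example:molsort"
    exclude-result-prefixes="xs ms">

  <!-- ms is bound to a made-up URI; any namespace of your own will do -->

  <xsl:template match="list">
    <xsl:copy>
      <xsl:for-each select="*">
        <!-- force codepoint comparison so that '~' sorts after A-Z and a-z -->
        <xsl:sort select="ms:molKey(.)"
            collation="http://www.w3.org/2005/xpath-functions/collation/codepoint"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

  <!-- hypothetical helper: builds a string sort key from a formula -->
  <xsl:function name="ms:molKey" as="xs:string">
    <xsl:param name="node" as="item()"/>
    <xsl:variable name="tokens" as="xs:string*">
      <!-- each match is a run of letters or a run of digits;
           brackets, commas and spaces never match and are dropped -->
      <xsl:analyze-string select="string($node)" regex="[A-Za-z]+|[0-9]+">
        <xsl:matching-substring>
          <xsl:sequence select="if (matches(., '^[0-9]+$'))
                                then concat('~', format-number(number(.), '000000'))
                                else ."/>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </xsl:variable>
    <xsl:sequence select="string-join($tokens, '')"/>
  </xsl:function>

</xsl:stylesheet>
```

Working through the sample by hand, the keys come out as CHCl~000003,
CNa~000003O~000005P and C~000004H~000007Cl~000003O~000002, and the key
for the long formula starts with C~000019, so they compare in the
requested order. Numbers longer than six digits would need a wider
picture string.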

Hope this helps.
Regards,
EB


On Wed, Nov 24, 2010 at 5:55 PM, Emma Burrows <Emma.Burrows@xxxxxxxxxxx> wrote:
> Hello,
>
> Using Saxon 9.2 and XSLT 2.0, I am currently sorting a list of chemical
> formulae which appears in the following format:
>
> <list>
>   <item1>(C<sub>19</sub>H<sub>22</sub>N<sub>2</sub>O)<sub>2</sub>,H<sub>2</sub>SO<sub>4</sub>,7H<sub>2</sub>O</item1>
>   <item1>C<sub>4</sub>H<sub>7</sub>Cl<sub>3</sub>O<sub>2</sub></item1>
>   <item1>CHCl<sub>3</sub></item1>
>   <item1>CNa<sub>3</sub>O<sub>5</sub>P </item1>
> </list>
>
> The desired sort order is:
>
> CHCl3
> CNa3O5P
> C4H7Cl3O2
> (C19H22N2O)2,H2SO4,7H2O
>
> So the rules are
> a. ignore brackets
> b. sort letters before numbers
> c. sort numbers numerically
>
> Using the following templates, I've managed to get as far as a and b, but I
> need a little help adding c to the mix:
>
> <xsl:template match="list">
>   <xsl:for-each select="item1">
>     <xsl:sort select="rps:molSort(.)" case-order="upper-first"/>
>     <xsl:copy-of select="."/>
>   </xsl:for-each>
> </xsl:template>
>
> <xsl:function name="rps:molSort" as="xs:string">
>    <xsl:param name="node"/>
>    <xsl:variable name="step1" select="replace(replace($node, '\(',''), '\)','')"/>
>    <xsl:variable name="step2" select="replace(replace($step1, '\[',''), '\]','')"/>
>    <xsl:variable name="step3"
>      select="translate($step2,'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789','0123456789aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ')"/>
>    <xsl:value-of select="$step3"/>
> </xsl:function>
>
> This produces the following output:
> CHCl3
> CNa3O5P
> (C19H22N2O)2,H2SO4,7H2O
> C4H7Cl3O2
>
> In other words, numbers are sorted as letters rather than numbers, so the
> subscripts go "1 10 11 2 3.." instead of "1 2 3... 10 11". I need an
> additional criterion somewhere to sort the numbers correctly but I haven't
> found a solution that works yet, so a nudge in the right direction would be
> great.
>
> Thank you!
