Re: Sorting chemical formulae in XSLT 2.0
Hello,

I think regexp would help. While it's been a while since I have had to deal with chemical elements, and am therefore not sure I completely understand your requirements, the following stylesheet gives the expected result:

    <xsl:template match="list">
      <xsl:for-each select="*">
        <xsl:sort select="ms:molSort2(.)"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </xsl:template>

    <xsl:function name="ms:molSort2">
      <xsl:param name="node"/>
      <xsl:variable name="filter">
        <!-- take out unwanted characters and only keep letters and numbers -->
        <xsl:analyze-string select="string($node)" regex="[A-Za-z0-9]+">
          <xsl:matching-substring>
            <xsl:value-of select="."/>
          </xsl:matching-substring>
        </xsl:analyze-string>
      </xsl:variable>
      <xsl:variable name="sortString">
        <!-- does two things: pads numbers, and transforms letters to their code,
             so that at the end we only have a long string of numbers -->
        <xsl:analyze-string select="$filter" regex="\d+">
          <xsl:matching-substring>
            <!-- this is a number -->
            <xsl:value-of select="format-number(number(.), '000')"/>
          </xsl:matching-substring>
          <xsl:non-matching-substring>
            <!-- (at this point) this is a character -->
            <xsl:value-of select="string-to-codepoints(.)"/>
          </xsl:non-matching-substring>
        </xsl:analyze-string>
      </xsl:variable>
      <xsl:value-of select="$sortString"/>
    </xsl:function>

Hope this helps.

Regards,
EB

On Wed, Nov 24, 2010 at 5:55 PM, Emma Burrows <Emma.Burrows@xxxxxxxxxxx> wrote:
> Hello,
>
> Using Saxon 9.2 and XSLT 2.0, I am currently sorting a list of chemical formulae which appears in the following format:
>
> <list>
>   <item1>(C<sub>19</sub>H<sub>22</sub>N<sub>2</sub>O)<sub>2</sub>,H<sub>2</sub> SO<sub>4</sub>,7H<sub>2</sub>O</item1>
>   <item1>C<sub>4</sub>H<sub>7</sub>Cl<sub>3</sub>O<sub>2</sub></item1>
>   <item1>CHCl<sub>3</sub></item1>
>   <item1>CNa<sub>3</sub>O<sub>5</sub>P </item1>
> </list>
>
> The desired sort order is:
>
> CHCl3
> CNa3O5P
> C4H7Cl3O2
> (C19H22N2O)2,H2SO4,7H2O
>
> So the rules are
> a. ignore brackets
> b. sort letters before numbers
> c. sort numbers numerically
>
> Using the following templates, I've managed to get as far as a and b, but I need a little help adding c to the mix:
>
> <xsl:template match="list">
>   <xsl:for-each select="item1">
>     <xsl:sort select="rps:molSort(item1)" case-order="upper-first"/>
>     <xsl:copy-of select="item1"/>
>   </xsl:for-each>
> </xsl:template>
>
> <xsl:function name="rps:molSort" as="xs:string">
>   <xsl:param name="node"/>
>   <xsl:variable name="step1" select="replace(replace($node, '\(',''), '\)','')"/>
>   <xsl:variable name="step2" select="replace(replace($step1, '\[',''), '\]','')"/>
>   <xsl:variable name="step3" select="translate($step2,'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789','0123456789aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ')"/>
>   <xsl:value-of select="$step3"/>
> </xsl:function>
>
> This produces the following output:
>
> CHCl3
> CNa3O5P
> (C19H22N2O)2,H2SO4,7H2O
> C4H7Cl3O2
>
> In other words, numbers are sorted as letters rather than numbers, so the subscripts go "1 10 11 2 3.." instead of "1 2 3... 10 11". I need an additional criterion somewhere to sort the numbers correctly but I haven't found a solution that works yet, so a nudge in the right direction would be great.
>
> Thank you!
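
For anyone wanting to try the idea from the reply (pad the numbers, turn the letters into code points), here is a minimal sketch of how it might be packaged as a complete stylesheet. It is a variant, not the exact code posted above. The namespace URI `urn:example:molsort` and the function name `ms:molSortKey` are placeholders chosen for illustration. The sketch differs from the reply in three ways. It tokenises the string directly, so digits separated by punctuation (the "4,7" in the first item) are not fused into "47". It pads each code point to three digits, so "a" (97) and "l" (108) compare by value rather than by width. It names the codepoint collation explicitly, so the result does not depend on the processor's default. It assumes ASCII letters and subscripts below 1000.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:ms="urn:example:molsort"
    exclude-result-prefixes="xs ms">
  <!-- urn:example:molsort is a placeholder namespace, not an existing library -->

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="list">
    <list>
      <xsl:for-each select="*">
        <!-- compare keys character by character, independent of locale -->
        <xsl:sort select="ms:molSortKey(.)"
                  collation="http://www.w3.org/2005/xpath-functions/collation/codepoint"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </list>
  </xsl:template>

  <!-- Hypothetical helper: builds a sort key from letter runs and digit runs;
       everything else (brackets, commas, spaces) is dropped. -->
  <xsl:function name="ms:molSortKey" as="xs:string">
    <xsl:param name="node" as="node()"/>
    <xsl:variable name="tokens" as="xs:string*">
      <xsl:analyze-string select="string($node)" regex="[A-Za-z]+|[0-9]+">
        <xsl:matching-substring>
          <xsl:choose>
            <xsl:when test="matches(., '^[0-9]+$')">
              <!-- a number: zero-pad to three digits so 4 sorts before 19 -->
              <xsl:sequence select="format-number(number(.), '000')"/>
            </xsl:when>
            <xsl:otherwise>
              <!-- letters: each becomes a space plus its three-digit code point;
                   the leading space sorts before any digit, so letters come
                   before numbers -->
              <xsl:sequence select="for $c in string-to-codepoints(.)
                                    return concat(' ', format-number($c, '000'))"/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </xsl:variable>
    <xsl:sequence select="string-join($tokens, '')"/>
  </xsl:function>

</xsl:stylesheet>
```

With the four sample items from the question, every key starts with the code for "C"; the next unit is a letter for CHCl3 and CNa3O5P and a padded number (004, 019) for the other two, which gives the requested order. Upper-case letters sort before lower-case ones here because the key uses raw code points; if a different letter order is needed, the letter branch is the place to change it.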