
RE: Efficently transposing tokenized data

Subject: RE: Efficently transposing tokenized data
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Tue, 4 Nov 2008 23:44:13 -0000
I can suggest several approaches, but I don't guarantee that any of them
will perform better than doing the repeated (wasteful) tokenization.

(1) Do a preprocessing pass in which you split the data attribute into
multiple elements, then proceed "as normal".

(2) Do a preprocessing pass to compute a sequence of NxM strings in one big
sequence, then operate by indexing into this big sequence.

(3) Write a user-defined function that calls tokenize() but with
saxon:memo-function="yes", so that the results of tokenizing a node are
remembered when you tokenize the same node again.
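
As a rough illustration of approach (1), here is a minimal sketch that assumes the MultiLine/Line input quoted below. The temporary element name "v" and the variable name "split" are illustrative choices, not anything from the original data. Each data attribute is tokenized once into a temporary tree, and the table is then built from that tree by position. Whether this is actually faster than repeated tokenization would need measuring.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="html" indent="yes"/>

  <xsl:template match="MultiLine">
    <!-- Pass 1: tokenize each @data exactly once into a temporary tree.
         The element name "v" is a hypothetical choice for this sketch. -->
    <xsl:variable name="split">
      <xsl:for-each select="Line">
        <Line title="{@title}">
          <xsl:for-each select="tokenize(normalize-space(@data), ' ')">
            <v><xsl:value-of select="."/></v>
          </xsl:for-each>
        </Line>
      </xsl:for-each>
    </xsl:variable>

    <!-- Pass 2: proceed "as normal" over the split elements. -->
    <table>
      <tr>
        <xsl:for-each select="$split/Line">
          <th><xsl:value-of select="@title"/></th>
        </xsl:for-each>
      </tr>
      <xsl:for-each select="1 to xs:integer(@samples)">
        <xsl:variable name="row" select="."/>
        <tr>
          <!-- One cell per Line: the $row-th value of that Line. -->
          <xsl:for-each select="$split/Line">
            <td><xsl:value-of select="v[$row]"/></td>
          </xsl:for-each>
        </tr>
      </xsl:for-each>
    </table>
  </xsl:template>

</xsl:stylesheet>
```

The normalize-space() call is an added precaution so that leading, trailing, or doubled spaces in the attribute do not produce empty tokens.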

I think I would probably go for (2) as it's simplest:

<xsl:variable name="bigArray" select="for $x in Line return
tokenize($x/@data, '\s')"/>
<xsl:variable name="columns" select="count(Line)"/>

<xsl:for-each select="1 to xs:integer(@samples)">
  <xsl:variable name="row" select="."/>
  <xsl:for-each select="1 to $columns">
    <xsl:variable name="col" select="."/>
    <td><xsl:value-of select="$bigArray[(:some function of $row and
$col, an exercise for the reader:)]"/></td>
  </xsl:for-each>
</xsl:for-each>
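
One possible answer to the indexing exercise, offered as a sketch rather than the only way to do it: because the big sequence holds all of the first Line's tokens, then all of the second Line's, and so on, the value for a given row and column sits at position ($col - 1) * samples + $row. This assumes every Line really does contain exactly @samples tokens. The "samples" variable and the surrounding table and tr wrappers are additions for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="html" indent="yes"/>

  <xsl:template match="MultiLine">
    <!-- Assumption: each Line holds exactly @samples tokens. -->
    <xsl:variable name="samples" select="xs:integer(@samples)"/>

    <!-- All tokens of Line 1, then all tokens of Line 2, and so on. -->
    <xsl:variable name="bigArray" as="xs:string*"
        select="for $x in Line
                return tokenize(normalize-space($x/@data), ' ')"/>
    <xsl:variable name="columns" select="count(Line)"/>

    <table>
      <xsl:for-each select="1 to $samples">
        <xsl:variable name="row" select="."/>
        <tr>
          <xsl:for-each select="1 to $columns">
            <xsl:variable name="col" select="."/>
            <!-- Skip ($col - 1) complete Lines, then take the $row-th token. -->
            <td>
              <xsl:value-of
                  select="$bigArray[($col - 1) * $samples + $row]"/>
            </td>
          </xsl:for-each>
        </tr>
      </xsl:for-each>
    </table>
  </xsl:template>

</xsl:stylesheet>
```

If a Line has fewer or more tokens than @samples, the positions of every later Line shift, so it may be worth checking count($bigArray) against $columns * $samples before building the table.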

Michael Kay

> -----Original Message-----
> From: Beldaz Jalfrezi [mailto:beldazj@xxxxxxxxxxxx] 
> Sent: 04 November 2008 23:13
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject:  Efficently transposing tokenized data
> Hi XSL-List members,
> I have been learning my way around XSLT (2.0) over the past 
> couple of weeks, principally using Saxon-B 9.1. I have an 
> immediate need to transform existing XML data, initially to 
> HTML. The XML contains, similar to SVG, space-tokenized 
> numerical data held within attributes. What I am having 
> trouble understanding is how to transform several nodes with 
> such data into individual columns of data.
> The XML is like this:
> <MultiLine samples="5">
>   <Line title="Line 1" data="0.1 0.4 0.6 0.4 0.8" />
>   <Line title="Line 2" data="0.4 0.2 0.5 1.4 3.8" />
> </MultiLine>
> There can be any number of <Line> nodes (typically 3 or 4) 
> and the number of values (samples) in each data attribute for 
> each <Line> must match the samples attribute of the parent 
> <MultiLine> node. The number of samples can be very long, 
> typically several thousand. At present this is the data that 
> I am being supplied, so regardless of any shortcomings of the 
> structure (although suggestions for improvement are welcome) 
> I am stuck with it.
> I want to transform such data into, e.g., HTML columns, like:
> <TABLE>
> <TR>
> <TH>Line 1</TH><TH>Line 2</TH>
> </TR>
> <TR>
> <TD>0.1</TD><TD>0.4</TD>
> </TR>
> <TR>
> <TD>0.4</TD><TD>0.2</TD>
> </TR>
> <TR>
> <TD>0.6</TD><TD>0.5</TD>
> </TR>
> ...
> </TABLE>
> I have seen similar questions where people have wanted to 
> transpose HTML table data, but the problem here is that the 
> input data has to be tokenized. I can output the data 
> sequentially for each Line with a template such as:
>   <xsl:template match="Line">
>     <TR><TH><xsl:value-of select="@title" /></TH></TR>
>     <xsl:for-each select="tokenize(@data,' ')">
>       <TR>
>         <TD><xsl:value-of select="position()" /></TD>
>         <TD><xsl:value-of select="." /></TD>
>       </TR>
>     </xsl:for-each>
>   </xsl:template>
> To format the data as adjacent columns the only approach I 
> can think of is a for-each loop over the number of samples, 
> and in each case call
> index-at(tokenize(../Line[n]/@data, ' '), position())
> for each Line. But this looks horribly 
> inefficient, since I appear to be tokenizing every @data 
> attribute repeatedly, presumably giving complexity 
> O(count(Line)*samples^2). What would be far more preferable 
> is to tokenize all the lines, and then at each index obtain 
> the appropriate value from each sequence. But I have no idea 
> how to do that in XSLT.
> I have spent at least a day trying to solve this conundrum, 
> so if anyone has any suggestions I would be very grateful.
> Kind regards,
> Beldaz
