RE: Collating riffled lists
The only thing that makes this O(n^2) is selecting the b by

<xsl:copy-of select="/list/b[$ix]/text()" />

This is relying on the way the optimizer works, but my guess is there's a good chance that if you declare a global variable:

<xsl:variable name="list-of-b" select="/list/b"/>

and then replace the above with

<xsl:copy-of select="$list-of-b[number($ix)]/text()"/>

then this instruction will execute in constant time rather than O(n) time.

I've put in the number($ix) because $ix is a parameter, so its type isn't known at compile time, and it really helps the optimizer (especially in Saxon 7.x, admittedly) to know in advance that the predicate is always going to be numeric.

Another approach would be that when you find a b, you pass it as a parameter in the recursive call, and instead of getting the next b by index, you get it using $lastb/following-sibling::b[1].

Michael Kay

> -----Original Message-----
> From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx] On Behalf Of
> Mat Myszewski
> Sent: 29 September 2003 18:43
> To: XSL-List@xxxxxxxxxxxxxxxxxxxxxx
> Subject: Collating riffled lists
>
> This is my first XSLT project. I have a recursive solution to a problem
> which I hope one of you can improve on.
>
> This is an abstraction of a problem that arose in the context of scraping
> PDF docs.
>
> PDF->Adobe->HTML->tidy->XML->scrape with XSLT->...
>
> The PDF->HTML conversion, or for that matter, lassoing the text in Acrobat
> Reader, cutting and pasting it, yields a different order than what is
> displayed on screen by Acrobat Reader. It's not so badly mangled that it
> can't be recovered. However, related items are no longer near one another.
> I need to recover the original relationship between the related items.
>
> I'm hoping someone can come up with a better solution than the one I
> present below, which I believe is O(n squared), where n is large (the
> original document is 170+ pages). I've considered outputting the a's and
> b's into two result files with two xslt programs and processing those. I
> think XSLT 1.1 would allow this to be done within a single xslt program by
> building two node sets, but I'd like to stick to 1.0, if possible.
>
> XML source:
>
> <?xml version="1.0"?>
> <list>
>   <a>a1</a>
>   <a>a2</a>
>   <a>a3</a>
>   <b>b1</b>
>   <b>b2</b>
>   <a>a4</a>
>   <b>b3</b>
>   <a>a5</a>
>   <b>b4</b>
>   <a>a6</a>
>   <b>b5</b>
>   <b>b6</b>
> </list>
>
> The XSLT:
>
> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>   version="1.0">
>
>   <!-- match first a and make top level call to recursive template -->
>   <xsl:template match="a[1]">
>     <xsl:call-template name="do_a">
>       <xsl:with-param name="ix" select="1" />
>     </xsl:call-template>
>   </xsl:template>
>
>   <!-- recursive template counts a's -->
>   <xsl:template name="do_a">
>     <xsl:param name="ix" />
>
>     <!-- output this a -->
>     <xsl:text>
> </xsl:text>
>     <xsl:value-of select="$ix" /><xsl:text>: </xsl:text><xsl:value-of
>       select="."/>
>
>     <!-- output corresponding b -->
>     <xsl:text> </xsl:text><xsl:copy-of select="/list/b[$ix]/text()" />
>
>     <!-- This for-each moves to the next a; doesn't loop. -->
>     <xsl:for-each select="following-sibling::a[1]">
>
>       <!-- increment counter and output rest of a's -->
>       <xsl:call-template name="do_a">
>         <xsl:with-param name="ix" select="$ix+1" />
>       </xsl:call-template>
>
>     </xsl:for-each>
>   </xsl:template>
>
>   <!-- suppress other output -->
>   <xsl:template match="text()" />
>
> </xsl:stylesheet>
>
> And this is the output (Saxon 6.5.1 with XFactor GUI):
>
> <?xml version="1.0" encoding="utf-8"?>
> 1: a1 b1
> 2: a2 b2
> 3: a3 b3
> 4: a4 b4
> 5: a5 b5
> 6: a6 b6
>
> Thanks,
>
> Mat M.
>
> XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
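
The second suggestion in the reply (passing the b along in the recursive call rather than looking it up by index) might look roughly like the following. This is only a sketch under some assumptions, not a tested replacement for the stylesheet above: the mode name "pair" and the parameter name "b" are made up for illustration, the recursion is restructured to use xsl:apply-templates instead of the named template, and text output is used instead of the default XML output.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Assumption: plain text output, one "ix: a b" pair per line. -->
  <xsl:output method="text"/>

  <!-- Start with the first a paired with the first b. -->
  <xsl:template match="/list">
    <xsl:apply-templates select="a[1]" mode="pair">
      <xsl:with-param name="ix" select="1"/>
      <xsl:with-param name="b" select="b[1]"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- "pair" and $b are hypothetical names, not taken from the thread. -->
  <xsl:template match="a" mode="pair">
    <xsl:param name="ix"/>
    <xsl:param name="b"/>

    <!-- Output the counter, this a, and the b that was handed in. -->
    <xsl:value-of select="concat($ix, ': ', ., ' ', $b, '&#10;')"/>

    <!-- Move to the next a and hand it the next b, found relative to
         the current b instead of by index from the start of the list. -->
    <xsl:apply-templates select="following-sibling::a[1]" mode="pair">
      <xsl:with-param name="ix" select="$ix + 1"/>
      <xsl:with-param name="b" select="$b/following-sibling::b[1]"/>
    </xsl:apply-templates>
  </xsl:template>

</xsl:stylesheet>
```

Run against the sample list above, this should pair a1 with b1 through a6 with b6. Whether the total cost is actually linear depends on the processor stopping at the first match when it evaluates following-sibling::b[1], and the recursion depth still grows with the number of a's, so a very long list may need a processor that handles deep recursion well.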