Collating riffled lists

Subject: Collating riffled lists
From: "Mat Myszewski" <mmyszew@xxxxxxxxxxx>
Date: Mon, 29 Sep 2003 13:42:56 -0400
This is my first XSLT project. I have a recursive solution to a problem
which I hope one of you can improve on.

This is an abstraction of a problem that arose in the context of scraping PDF
docs.

PDF->Adobe->HTML->tidy->XML->scrape with XSLT->...

The PDF->HTML conversion, or for that matter, lassoing the text in Acrobat
Reader, cutting and pasting it, yields a different order than what is
displayed on screen by Acrobat Reader. It's not so badly mangled that it
can't be recovered. However, related items are no longer near one another. I
need to recover the original relationship between the related items.

I'm hoping someone can come up with a better solution than the one I present
below, which I believe is O(n squared), where n is large (the original
document is 170+ pages). I've considered outputting the a's and b's into two
result files with two XSLT programs and processing those. I think XSLT 1.1
would allow this to be done within a single XSLT program by building two
node sets, but I'd like to stick to 1.0, if possible.

XML source:

<?xml version="1.0"?>
<list>
<a>a1</a>
<a>a2</a>
<a>a3</a>
<b>b1</b>
<b>b2</b>
<a>a4</a>
<b>b3</b>
<a>a5</a>
<b>b4</b>
<a>a6</a>
<b>b5</b>
<b>b6</b>
</list>

The XSLT:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">

<!-- match first a and make top level call to recursive template -->
<xsl:template match="a[1]">
    <xsl:call-template name="do_a">
        <xsl:with-param name="ix" select="1" />
    </xsl:call-template>
</xsl:template>

<!-- recursive template counts a's -->
<xsl:template name="do_a">
    <xsl:param name="ix" />

    <!-- output this a -->
    <xsl:text>
</xsl:text>
    <xsl:value-of select="$ix" />
    <xsl:text>: </xsl:text>
    <xsl:value-of select="." />

    <!-- output corresponding b -->
    <xsl:text> </xsl:text><xsl:copy-of select="/list/b[$ix]/text()" />

    <!-- This for-each moves to the next a; doesn't loop. -->
    <xsl:for-each select="following-sibling::a[1]">

        <!-- increment counter and output rest of a's -->
        <xsl:call-template name="do_a">
            <xsl:with-param name="ix" select="$ix+1" />
        </xsl:call-template>

    </xsl:for-each>
</xsl:template>

<!-- suppress other output -->
<xsl:template match="text()" />

</xsl:stylesheet>

And this is the output (Saxon 6.5.1 with XFactor GUI):

<?xml version="1.0" encoding="utf-8"?>
1: a1 b1
2: a2 b2
3: a3 b3
4: a4 b4
5: a5 b5
6: a6 b6
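
For comparison, here is a non-recursive sketch that stays within XSLT 1.0. It is an assumption on my part that it would be faster: it selects the b's once into a variable (named bs here, my own choice) and indexes that variable by the position of each a, instead of re-evaluating /list/b[$ix] from the root for every a. Whether $bs[$ix] is a constant-time lookup depends on the processor, and I have not timed it against the real document. The xsl:output method="text" line is also an addition, to drop the XML declaration from the output.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">

<!-- assumption: plain text output is wanted, so no XML declaration -->
<xsl:output method="text" />

<xsl:template match="/list">

    <!-- select all the b's once, in document order -->
    <xsl:variable name="bs" select="b" />

    <!-- visit the a's in document order -->
    <xsl:for-each select="a">

        <!-- position() is the index of this a among the selected a's -->
        <xsl:variable name="ix" select="position()" />

        <!-- output this a -->
        <xsl:text>&#10;</xsl:text>
        <xsl:value-of select="$ix" />
        <xsl:text>: </xsl:text>
        <xsl:value-of select="." />

        <!-- output the b with the same index -->
        <xsl:text> </xsl:text>
        <xsl:value-of select="$bs[$ix]" />

    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

Because the template for /list never calls xsl:apply-templates, the stray text nodes are not copied, so the empty text() template is not needed in this version. It pairs the n-th a with the n-th b, exactly as the recursive version does, and so makes the same assumption that the two lists are in matching order.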

Thanks,

            Mat M.




 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list