[XSL-LIST Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message]

Spotting "cousin marriages" in a tree

Subject: Spotting "cousin marriages" in a tree
From: Phil Endecott <spam_from_xslt_list@xxxxxxxxxxxx>
Date: Wed, 28 Jul 2004 22:18:50 +0100
cousin tree
Dear XSLT experts,

I have a flat input like this:

<things>
  <thing id="1"> <child idref="2"/> <child idref="3"/> </thing>
  <thing id="2"/>
  <thing id="3"/>
</things>

I want to match up the ids with the idrefs to build a tree that looks like this:

<tree id="1">
  <tree id="2"/>
  <tree id="3"/>
</tree>

I could do it like this:

<xsl:template match="thing">
  <tree>
    <xsl:for-each select="child">
      <xsl:apply-templates
           select="/things/thing[@id=current()/@idref]"/>
    </xsl:for-each>
  </tree>
</xsl:template>

But that is O(n^2), so I'd instead do it like this:

<xsl:key name="things-by-id" match="thing" use="@id"/>
<xsl:template match="thing">
  <tree>
    <xsl:for-each select="child">
      <xsl:apply-templates select="key('things-by-id',@idref)"/>
    </xsl:for-each>
  </tree>
</xsl:template>

which is O(n log n).

This is all fine. But very rarely my input may not describe a pure tree; it may contain "cousin marriages" or more "incestuous" relationships like this:

<things>
  <thing id="1"> <child idref="2"/> <child idref="3"/> </thing>
  <thing id="2"> <child idref="4"/> </thing>
  <thing id="3"> <child idref="4"/> </thing>
  <thing id="4"/>
</things>

If you apply either of the above templates to this, thing 4 appears in the output twice:

<tree id="1">
  <tree id="2">
    <tree id="4"/>
  </tree>
  <tree id="3">
    <tree id="4">
  </tree>
</tree>

which is very bad in my application. What I need is something like this:

<tree id="1">
  <tree id="2">
    <tree id="4"/>
  </tree>
  <tree id="3">
    <error/>
  </tree>
</tree>

I can't afford for this to become an O(n^2) operation.

The obvious thought is that I need to check at the point of recursion whether the thing I am about to process has already been output. But there is no "preceding-in-the-output" axis or other way I can think of to note which things have already been output.

An alternative is to apply a test to the other input elements to see if other things have the same thing as this one as a child, but that fails because my input may legitimately contain redundant nodes like this:

<things>
  <thing id="1"> <child idref="2"/> <child idref="3"/> </thing>
  <thing id="2"/>
  <thing id="3"/>
  <thing id="9"> <child idref="2"/> </thing>
</things>

In this case I want the same output as the very first example, i.e. thing 9 doesn't appear in the output at all.

I can do a multi-pass process using exsl:node-set(), generating the tree with the duplicates in it and then deleting them if they are duplicates, i.e. something like this:

<xsl:template match="/">
  <xsl:variable name="n">
    <xsl:apply-templates select="things/thing[1]"/>  (as above)
  </xsl:variable>
  <xsl:apply-templates select="exsl:node-set($n)" mode="remove-dupes"/>
</xsl:template>

<xsl:template match="tree" mode="remove-dupes">
  <xsl:choose>
    <xsl:when test="preceding::tree[@id=current/@id]">
      <error/>
    </xsl:when>
    <xsl:otherwise>
      <tree id="{@id}">
        <xsl:apply-templates mode="remove-dupes"/>
      </tree>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

There are two problems with this. First, generating the intermediate result could run forever if the input is seriously malformed (e.g. a loop, child refers to parent); checking as it is generated would avoid this. Second, the test "preceding::tree[@id=current/@id]" is O(n^2). Can I use a key to avoid this? How can a key be applied to an exsl:node-set() result?

XSLT often needs some lateral thinking to come up with a good solution. I hope that someone who's read this far has a trick to solve this for me.

Thanks in advance!

--Phil.

Current Thread

PURCHASE STYLUS STUDIO ONLINE TODAY!

Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced!

Buy Stylus Studio Now

Download The World's Best XML IDE!

Accelerate XML development with our award-winning XML IDE - Download a free trial today!

Don't miss another message! Subscribe to this list today.
Email
First Name
Last Name
Company
Subscribe in XML format
RSS 2.0
Atom 0.3
Site Map | Privacy Policy | Terms of Use | Trademarks
Free Stylus Studio XML Training:
W3C Member
Stylus Studio® and DataDirect XQuery ™are products from DataDirect Technologies, is a registered trademark of Progress Software Corporation, in the U.S. and other countries. © 2004-2013 All Rights Reserved.