Hi Quinn,
I'm not sure I follow the requirement perfectly either, but I thought this was interesting enough to give it some thought overnight (I find puzzles can be relaxing), and maybe this idea would complement what Michael has suggested.

It seems to me that if you want to collect groups of 2+ words that appear in 2+ places, a useful first step would be to collect the set of intersections of words occurring in every pairing of places. This would be a large number, n(n-1)/2 for n places (three pairings for the three places below), but not the huge power of 2 cited by Michael, and hence possibly a more direct route to your goal.

That is, for data:

<atlas>
<place>
<place_number>1</place_number>
<words>
<word>Aa</word>
<word>C</word>
<word>Qqq</word>
</words>
</place>
<place>
<place_number>2</place_number>
<words>
<word>Aa</word>
<word>Bbbb</word>
<word>C</word>
<word>W</word>
<word>Zz</word>
</words>
</place>
<place>
<place_number>3</place_number>
<words>
<word>Aa</word>
<word>C</word>
<word>Bb</word>
<word>Qqq</word>
<word>Wwww</word>
<word>Zz</word>
</words>
</place>
</atlas>

this template

<xsl:template match="atlas">
<collection>
<xsl:for-each select="place">
<xsl:variable name="first" select="."/>
<xsl:for-each select="preceding-sibling::place">
<xsl:variable name="second" select="."/>
<common_words>
<xsl:copy-of select="$first/place_number, $second/place_number"/>
<words>
<xsl:copy-of select="$first/words/word[.=$second/words/word]"/>
</words>
</common_words>
</xsl:for-each>
</xsl:for-each>
</collection>
</xsl:template>

yields this result:

<?xml version="1.0" encoding="UTF-8"?>
<collection>
<common_words>
<place_number>2</place_number>
<place_number>1</place_number>
<words>
<word>Aa</word>
<word>C</word>
</words>
</common_words>
<common_words>
<place_number>3</place_number>
<place_number>1</place_number>
<words>
<word>Aa</word>
<word>C</word>
<word>Qqq</word>
</words>
</common_words>
<common_words>
<place_number>3</place_number>
<place_number>2</place_number>
<words>
<word>Aa</word>
<word>C</word>
<word>Zz</word>
</words>
</common_words>
</collection>

You didn't say how many places you have, so I don't know how large the set will get.

While this isn't quite what you want, the results you want could be derived by grouping these lists further, skipping pairings that contain fewer than two 'word' elements, and collecting together those that have the same sets (and thus represent sets of words that occur in more than two places).

If this approach is unsound, I'm sure a friendly mathematician can explain why. :->

I hope this helps,
Wendell

At 03:15 PM 12/10/2008, you wrote:
Hello,

======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
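
The further grouping step described in the message (skip pairings with fewer than two words, then collect pairings whose word sets are identical) might be sketched roughly as follows. This is only an illustration under assumptions, not a tested solution: it assumes an XSLT 2.0 processor, takes the <collection> result shown above as its input, and assumes no word contains the '|' character used to build the grouping key. The element names shared_sets and word_set, the function my:key, and the namespace URI urn:example:my are hypothetical names invented for this sketch.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:my"
    exclude-result-prefixes="xs my">

  <xsl:output indent="yes"/>

  <!-- Hypothetical helper: builds an order-independent key for a list of
       word elements by sorting the distinct word strings and joining them.
       Assumes '|' never occurs inside a word. -->
  <xsl:function name="my:key" as="xs:string">
    <xsl:param name="words" as="element(word)*"/>
    <xsl:variable name="sorted" as="xs:string*">
      <xsl:perform-sort select="distinct-values($words/string())">
        <xsl:sort select="."/>
      </xsl:perform-sort>
    </xsl:variable>
    <xsl:sequence select="string-join($sorted, '|')"/>
  </xsl:function>

  <xsl:template match="collection">
    <shared_sets>
      <!-- Skip pairings with fewer than two common words, then group the
           remaining pairings by their word set. -->
      <xsl:for-each-group
          select="common_words[count(words/word) ge 2]"
          group-by="my:key(words/word)">
        <word_set>
          <!-- Every place that took part in a pairing with this word set. -->
          <xsl:for-each select="distinct-values(current-group()/place_number)">
            <xsl:sort select="number(.)"/>
            <place_number><xsl:value-of select="."/></place_number>
          </xsl:for-each>
          <!-- The word list itself, copied from the first pairing in the group. -->
          <xsl:copy-of select="words"/>
        </word_set>
      </xsl:for-each-group>
    </shared_sets>
  </xsl:template>

</xsl:stylesheet>
```

In the sample result above, the three pairings have three different word lists, so each group would contain a single pairing. Groups spanning more than two places appear only when several pairings share exactly the same list.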



