Re: Calculating groups of repeating elements

Subject: Re: Calculating groups of repeating elements
From: Quinn Dombrowski <qdombrow@xxxxxxxx>
Date: Thu, 11 Dec 2008 15:06:57 -0600

Thanks a ton Wendell, Michael L and Michael K! You've given me quite a lot to chew on. I'm going to give it a shot on my real data set (a pile of Cyrillic with extra diacritics and linguistic symbols) and let you know how it goes.

Wendell Piez wrote:

Hi,

At 12:58 PM 12/11/2008, Michael wrote:
It seems to me that if you are wanting to collect groups of 2+ words
that appear in 2+ places, a useful first step would be to collect the
set of intersections of words occurring in every pairing of places.
This would be a large number, n(n-1)/2 for n places, but not the huge
exponent of 2 cited by Michael, and hence possibly a more direct route
to your goal.
Great! This looks like a much more useful approach to the problem!
Thanks ... I hope so.

BTW, since writing that it has also occurred to me that by declaring a key that would return places based on descendant word elements, one could speed up the generation of this set and avoid empty intersections. So:

<xsl:key name="place-by-word" match="place" use=".//word"/>

<xsl:template match="atlas">
  <collection>
    <xsl:for-each select="place">
      <xsl:variable name="first" select="."/>
      <!-- other places sharing at least one word with $first; the
           document-order test visits each pairing only once -->
      <xsl:for-each select="key('place-by-word', .//word)[. &lt;&lt; $first]">
        <xsl:variable name="second" select="."/>
        <common_words>
          <xsl:copy-of select="$first/place_number, $second/place_number"/>
          <words>
            <xsl:copy-of select="$first/words/word[. = $second/words/word]"/>
          </words>
        </common_words>
      </xsl:for-each>
    </xsl:for-each>
  </collection>
</xsl:template>

(This requires testing, of course.)
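
For anyone who wants to check the logic outside XSLT, here is a minimal Python sketch of the same idea: an inverted index from word to places stands in for the key, so only pairings that actually share a word are ever produced. The function name, the dictionary layout and the sample data are assumptions made for illustration, not anything from the OP's data set.

```python
from collections import defaultdict
from itertools import combinations


def common_words(places):
    """Map each pair of places sharing at least one word to the shared words.

    `places` is a dict of place number -> set of words (hypothetical layout).
    """
    # Inverted index: word -> places containing it (the role of 'place-by-word').
    places_by_word = defaultdict(set)
    for place, words in places.items():
        for word in words:
            places_by_word[word].add(place)

    shared = defaultdict(set)
    for word, holders in places_by_word.items():
        # Sorting fixes the order inside each pair, so (1, 2) and (2, 1)
        # are never both generated.
        for first, second in combinations(sorted(holders), 2):
            shared[(first, second)].add(word)
    return dict(shared)


# Hypothetical sample data: place 3 shares nothing, so it yields no pairing.
places = {
    1: {"a", "b", "c"},
    2: {"a", "b", "d"},
    3: {"e"},
}
print(common_words(places))
```

Pairings with an empty intersection never appear in the result, which is the saving the key is meant to buy.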
While this isn't quite what you want, the results you want could be
derived by grouping these lists further, skipping pairings that
contain fewer than two 'word' elements, and collecting together those
that have the same sets (and thus represent sets of words that occur
in more than two places).
Yes. But I think you must still generate the subsets, because if you
have, say, three occurrences of (a,b,c) and two of (a,b,d), you have
five occurrences of (a,b), which is interesting, if my understanding of
the requirement is correct.
This is a good point; only the OP can say if it's in scope.
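
To make the arithmetic in that example concrete, here is a small Python sketch that counts every subset of two or more words per place by brute force. This is the exhaustive subset enumeration, so it is only sensible for short word lists; the function name and the sample data are assumptions chosen to mirror the (a,b,c) / (a,b,d) case.

```python
from collections import Counter
from itertools import combinations


def count_word_groups(places, min_size=2):
    """Count, for every group of at least `min_size` words, the places containing it."""
    counts = Counter()
    for words in places:
        ordered = sorted(words)  # canonical order so each group has one tuple form
        for size in range(min_size, len(ordered) + 1):
            for group in combinations(ordered, size):
                counts[group] += 1
    return counts


# Three occurrences of (a,b,c) and two of (a,b,d), as in the example above.
places = [{"a", "b", "c"}] * 3 + [{"a", "b", "d"}] * 2
counts = count_word_groups(places)
print(counts[("a", "b")])       # 5
print(counts[("a", "b", "c")])  # 3
print(counts[("a", "b", "d")])  # 2
```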

(Hm: could this be done by recursing to intersect among the intersections, dropping singleton cases along the way? The overload warning lamp in my brain is now starting to flash.)
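
One possible reading of that recursion, sketched in Python rather than XSLT and offered as an untested-on-real-data assumption: take the pairwise intersections, drop anything below two words, then keep intersecting the intersections until no new set turns up, and finally count the places that contain each surviving set. The function name and sample data are hypothetical.

```python
from itertools import combinations


def repeated_groups(places, min_size=2):
    """Return {group: number of places containing it} via repeated intersection."""
    place_sets = [frozenset(p) for p in places]

    # Round one: intersect every pairing of places, dropping small results.
    groups = {
        a & b
        for a, b in combinations(place_sets, 2)
        if len(a & b) >= min_size
    }

    # Later rounds: intersect the intersections until nothing new appears.
    while True:
        new = {
            a & b
            for a, b in combinations(groups, 2)
            if len(a & b) >= min_size
        } - groups
        if not new:
            break
        groups |= new

    # A group occurs in a place when it is a subset of that place's words.
    return {g: sum(1 for p in place_sets if g <= p) for g in groups}


# Three occurrences of (a,b,c) and two of (a,b,d).
places = [{"a", "b", "c"}] * 3 + [{"a", "b", "d"}] * 2
result = repeated_groups(places)
print(result[frozenset("ab")])  # 5
```

Unlike enumerating every subset, this reports only groups that arise as an intersection, so (a,c) is not listed separately from (a,b,c) in this example; whether that is the desired output is for the OP to say.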

This continues to be interesting.
Yes, it does.
Cheers,
Wendell
======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
