[XSL-LIST Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message] Re: Grouping elements that have at least one common va
This feels to me as being not so much a grouping problem, as a transitive closure problem. There's a relation between two elements ("have at least one value in common") and you're looking for the transitive closure of that relation to form one group. For efficiency you first need to find a way of evaluating the basic relation efficiently, and then you need an efficient way of finding its transitive closure. For the basic relation, start by defining a key <xsl:key name="k" match="GRCHOIX" use="CHOIX/@CODE"/> Then key('k', 'choix-1') will find all GRCHOIX elements that have 'choix-1' as one of their values. And key('k', //GRCHOIX/@CODE[.="grchoix-1"]/CHOIX/@CODE) will find all GRCHOIX elements that have either 'choix-1' or 'choix-2' as one of their values. So the base relation is defined by a function: <xsl:function name="f:step"> as="element(GRCHOIX)*"> <xsl:param name="start" as="element(GRCHOIX)"/> <xsl:sequence select="key('k', $start/CHOIX/@CODE)"/> </xsl:function> Then you need to find the transitive closure of this step. As it happens we've just three days ago defined a transitive closure function for inclusion in XPath 4.0, which you can find here: https://qt4cg.org/pr/533/xpath-functions-40/Overview.html#func-transitive-clo sure Unfortunately, you need it now! You can implement it yourself, but it's not particularly easy. One way might be like this: <xsl:iterate select="1 to 100000000"> <xsl:param name="result" as="node()*" select="()"/> <xsl:param name="origin" as="node()*" select="$start-node"/> <xsl:variable name="next-nodes" select="($origin ! f:step(.)) except $result"/> <xsl:choose> <xsl:when test="empty($next-nodes)"> <xsl:sequence select="$result"/> </xsl:when> <xsl:otherwise> <xsl:next-iteration> <xsl:with-param name="result" select="$result | $next-ndoes"/> <xsl:with-param name="origin" select="$next-nodes"/> </ </ </ which basically keeps looking for more nodes satisfying the relation until no more are found. This gives you one group starting from one specific $start-node, which you presumably select at random. Then repeat, starting from a different $start-node, one that has not already been allocated to a group; until no more nodes are left. NOT TESTED! Michael Kay Saxonica > grchoix-1 > On 16 Jun 2023, at 12:09, Matthieu Ricaud-Dussarget ricaudm@xxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote: > > Hi all, > > I need to group elements that have at least one common value : > > <FORMS> > <!--CASE 1--> > <GRCHOIX CODE="grchoix-1"> > <CHOIX CODE="choix-1"/> > <CHOIX CODE="choix-2"/> > </GRCHOIX> > <GRCHOIX CODE="grchoix-2"> > <CHOIX CODE="choix-1"/> > <CHOIX CODE="choix-2"/> > <CHOIX CODE="choix-3"/> > </GRCHOIX> > <GRCHOIX CODE="grchoix-3"> > <CHOIX CODE="choix-2"/> > <CHOIX CODE="choix-3"/> > <CHOIX CODE="choix-4"/> > </GRCHOIX> > <!--CASE 2--> > <GRCHOIX CODE="grchoix-A"> > <CHOIX CODE="choix-a"/> > <CHOIX CODE="choix-b"/> > </GRCHOIX> > <GRCHOIX CODE="grchoix-B"> > <CHOIX CODE="choix-b"/> > <CHOIX CODE="choix-c"/> > </GRCHOIX> > <GRCHOIX CODE="grchoix-C"> > <CHOIX CODE="choix-d"/> > <CHOIX CODE="choix-e"/> > </GRCHOIX> > <GRCHOIX CODE="grchoix-D"> > <CHOIX CODE="choix-c"/> > <CHOIX CODE="choix-d"/> > </GRCHOIX> > </FORMS> > > * grchoix-1 and grchoix-2 have 2 common values : choix-1, choix-2 b they must belong the same group > * grchoix-2 and grchoix-3 have 2 common values : choix-2, choix-3 b they must belong the same group > b At the end : grchoix-1, grchoix-2 and grchoix-3 must belong the same group > > * grchoix-A and grchoix-B have 1 common value choix-b b they must belong the same group > * grchoix-B and grchoix-C have 0 common values > * but as grchoix-D have > - 1 common value (choix-c) with grchoix-B > - 1 one common value (choix-d) with grchoix-C > Then grchoix-B and grchoix-C must also belong the same GROUP > b At the end : grchoix-A, grchoix-B, grchoix-C and grchoix-D must belong the same group > > The expected XML output is 2 Groups : > <FORMS> > <!--CASE 1--> > <GROUP> > <GRCHOIX CODE="grchoix-1"> > <CHOIX CODE="choix-1"/> > <CHOIX CODE="choix-2"/> > </GRCHOIX> > <GRCHOIX CODE="grchoix-2"> > <CHOIX CODE="choix-1"/> > <CHOIX CODE="choix-2"/> > <CHOIX CODE="choix-3"/> > </GRCHOIX> > <GRCHOIX CODE="grchoix-3"> > <CHOIX CODE="choix-2"/> > <CHOIX CODE="choix-3"/> > <CHOIX CODE="choix-4"/> > </GRCHOIX> > </GROUP> > <!--CASE 2--> > <GROUP> > <GRCHOIX CODE="grchoix-A"> > <CHOIX CODE="choix-a"/> > <CHOIX CODE="choix-b"/> > </GRCHOIX> > <GRCHOIX CODE="grchoix-B"> > <CHOIX CODE="choix-b"/> > <CHOIX CODE="choix-c"/> > </GRCHOIX> > <GRCHOIX CODE="grchoix-C"> > <CHOIX CODE="choix-d"/> > <CHOIX CODE="choix-e"/> > </GRCHOIX> > <GRCHOIX CODE="grchoix-D"> > <CHOIX CODE="choix-c"/> > <CHOIX CODE="choix-d"/> > </GRCHOIX> > </GROUP> > </FORMS> > > I'm really not sure how to go ahead with such a grouping problem. > I've tried something that work on my sample but it's probably too complicated (maybe it doesnt' work all cases) and most of all it has drastic bad performances ! > I'm using fro-each-group on multiple values, wich causes all combination to be executed and make the internal variables huge (from A 8Mo input I get a 500 Mo size of intermediate step) > At this point I get without surprise a saxon internal error on my real (rather big) XML input. > > See my (so) bad solution in a next message (I get a message size to long when I copy it here) > > I'm pretty sure there's something quite more elegant to do, did you ever had to do such a grouping ? > BTW I'm using XSLT 3. > > Any suggestion/help appreciated > > Cheers, > > Matthieu Ricaud-Dussarget > > > > XSL-List info and archive <http://www.mulberrytech.com/xsl/xsl-list> > EasyUnsubscribe <http://lists.mulberrytech.com/unsub/xsl-list/3500899> (by email <>)
|
PURCHASE STYLUS STUDIO ONLINE TODAY!Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced! Download The World's Best XML IDE!Accelerate XML development with our award-winning XML IDE - Download a free trial today! Subscribe in XML format
|