[XSL-LIST Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message]

Re: Grouping elements that have at least one common va

Subject: Re: Grouping elements that have at least one common value
From: "John Lumley john@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 16 Jun 2023 13:29:24 -0000
Re:  Grouping elements that have at least one common va
Doesnbt the xsl:iterate solution need an xsl:break under the xsl:when
test=bempty($next-nodes)b, Mike?

John Lumley

Sent from my iPad

> On 16 Jun 2023, at 13:00, Michael Kay michaelkay90@xxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> o;?This feels to me as being not so much a grouping problem, as a transitive
closure problem. There's a relation between two elements ("have at least one
value in common") and you're looking for the transitive closure of that
relation to form one group.
>
> For efficiency you first need to find a way of evaluating the basic relation
efficiently, and then you need an efficient way of finding its transitive
closure.
>
> For the basic relation, start by defining a key
>
> <xsl:key name="k" match="GRCHOIX" use="CHOIX/@CODE"/>
>
> Then key('k', 'choix-1') will find all GRCHOIX elements that have 'choix-1'
as one of their values.
>
> And key('k', //GRCHOIX/@CODE[.="grchoix-1"]/CHOIX/@CODE) will find all
GRCHOIX elements that have either 'choix-1' or 'choix-2' as one of their
values.
>
> So the base relation is defined by a function:
>
> <xsl:function name="f:step"> as="element(GRCHOIX)*">
>   <xsl:param name="start" as="element(GRCHOIX)"/>
>   <xsl:sequence select="key('k', $start/CHOIX/@CODE)"/>
> </xsl:function>
>
> Then you need to find the transitive closure of this step.
>
> As it happens we've just three days ago defined a transitive closure
function for inclusion in XPath 4.0, which you can find here:
>
>
https://qt4cg.org/pr/533/xpath-functions-40/Overview.html#func-transitive-clo
sure
>
> Unfortunately, you need it now! You can implement it yourself, but it's not
particularly easy. One way might be like this:
>
> <xsl:iterate select="1 to 100000000">
>    <xsl:param name="result" as="node()*" select="()"/>
>    <xsl:param name="origin" as="node()*" select="$start-node"/>
>    <xsl:variable name="next-nodes" select="($origin ! f:step(.)) except
$result"/>
>    <xsl:choose>
>       <xsl:when test="empty($next-nodes)">
>            <xsl:sequence select="$result"/>
>       </xsl:when>
>       <xsl:otherwise>
>             <xsl:next-iteration>
>                  <xsl:with-param name="result" select="$result |
$next-ndoes"/>
>                  <xsl:with-param name="origin" select="$next-nodes"/>
>            </
>       </
>  </
>
> which basically keeps looking for more nodes satisfying the relation until
no more are found.
>
> This gives you one group starting from one specific $start-node, which you
presumably select at random.
>
> Then repeat, starting from a different $start-node, one that has not already
been allocated to a group; until no more nodes are left.
>
> NOT TESTED!
>
> Michael Kay
> Saxonica
>
>
>
>> grchoix-1
>
>
>
>> On 16 Jun 2023, at 12:09, Matthieu Ricaud-Dussarget ricaudm@xxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> Hi all,
>>
>> I need to group elements that have at least one common value :
>>
>> <FORMS>
>>   <!--CASE 1-->
>>   <GRCHOIX CODE="grchoix-1">
>>     <CHOIX CODE="choix-1"/>
>>     <CHOIX CODE="choix-2"/>
>>   </GRCHOIX>
>>   <GRCHOIX CODE="grchoix-2">
>>     <CHOIX CODE="choix-1"/>
>>     <CHOIX CODE="choix-2"/>
>>     <CHOIX CODE="choix-3"/>
>>   </GRCHOIX>
>>   <GRCHOIX CODE="grchoix-3">
>>     <CHOIX CODE="choix-2"/>
>>     <CHOIX CODE="choix-3"/>
>>     <CHOIX CODE="choix-4"/>
>>   </GRCHOIX>
>>   <!--CASE 2-->
>>   <GRCHOIX CODE="grchoix-A">
>>     <CHOIX CODE="choix-a"/>
>>     <CHOIX CODE="choix-b"/>
>>   </GRCHOIX>
>>   <GRCHOIX CODE="grchoix-B">
>>     <CHOIX CODE="choix-b"/>
>>     <CHOIX CODE="choix-c"/>
>>   </GRCHOIX>
>>   <GRCHOIX CODE="grchoix-C">
>>     <CHOIX CODE="choix-d"/>
>>     <CHOIX CODE="choix-e"/>
>>   </GRCHOIX>
>>   <GRCHOIX CODE="grchoix-D">
>>     <CHOIX CODE="choix-c"/>
>>     <CHOIX CODE="choix-d"/>
>>   </GRCHOIX>
>> </FORMS>
>>
>> * grchoix-1 and grchoix-2 have 2 common values : choix-1, choix-2 b they
must belong the same group
>> * grchoix-2 and grchoix-3 have 2 common values : choix-2, choix-3 b they
must belong the same group
>> b At the end : grchoix-1, grchoix-2 and grchoix-3 must belong the same
group
>>
>> * grchoix-A and grchoix-B have 1 common value choix-b b they must belong
the same group
>> * grchoix-B and grchoix-C have 0 common values
>> * but as grchoix-D have
>>    - 1 common value (choix-c) with grchoix-B
>>    - 1 one common value (choix-d) with grchoix-C
>>   Then grchoix-B and grchoix-C must also belong the same GROUP
>> b At the end : grchoix-A, grchoix-B, grchoix-C and grchoix-D must belong
the same group
>>
>> The expected XML output is 2 Groups :
>> <FORMS>
>>   <!--CASE 1-->
>>   <GROUP>
>>     <GRCHOIX CODE="grchoix-1">
>>       <CHOIX CODE="choix-1"/>
>>       <CHOIX CODE="choix-2"/>
>>     </GRCHOIX>
>>     <GRCHOIX CODE="grchoix-2">
>>       <CHOIX CODE="choix-1"/>
>>       <CHOIX CODE="choix-2"/>
>>       <CHOIX CODE="choix-3"/>
>>     </GRCHOIX>
>>     <GRCHOIX CODE="grchoix-3">
>>       <CHOIX CODE="choix-2"/>
>>       <CHOIX CODE="choix-3"/>
>>       <CHOIX CODE="choix-4"/>
>>     </GRCHOIX>
>>   </GROUP>
>>   <!--CASE 2-->
>>   <GROUP>
>>     <GRCHOIX CODE="grchoix-A">
>>       <CHOIX CODE="choix-a"/>
>>       <CHOIX CODE="choix-b"/>
>>     </GRCHOIX>
>>     <GRCHOIX CODE="grchoix-B">
>>       <CHOIX CODE="choix-b"/>
>>       <CHOIX CODE="choix-c"/>
>>     </GRCHOIX>
>>     <GRCHOIX CODE="grchoix-C">
>>       <CHOIX CODE="choix-d"/>
>>       <CHOIX CODE="choix-e"/>
>>     </GRCHOIX>
>>     <GRCHOIX CODE="grchoix-D">
>>       <CHOIX CODE="choix-c"/>
>>       <CHOIX CODE="choix-d"/>
>>     </GRCHOIX>
>>   </GROUP>
>> </FORMS>
>>
>> I'm really not sure how to go ahead with such a grouping problem.
>> I've tried something that work on my sample but it's probably too
complicated (maybe it doesnt' work all cases) and most of all it has drastic
bad performances !
>> I'm using fro-each-group on multiple values, wich causes all combination to
be executed and make the internal variables huge (from A 8Mo input I get a 500
Mo size of intermediate step)
>> At this point I get without surprise a saxon internal error on my real
(rather big) XML input.
>>
>> See my (so) bad solution in a next message (I get a message size to long
when I copy it here)
>>
>> I'm pretty sure there's something quite more elegant to do, did you ever
had to do such a grouping ?
>> BTW I'm using XSLT 3.
>>
>> Any suggestion/help appreciated
>>
>>  Cheers,
>>
>> Matthieu Ricaud-Dussarget
>>
>>
>>
>> XSL-List info and archive
>> EasyUnsubscribe (by email)
>
> XSL-List info and archive
> EasyUnsubscribe (by email)

Current Thread

PURCHASE STYLUS STUDIO ONLINE TODAY!

Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced!

Buy Stylus Studio Now

Download The World's Best XML IDE!

Accelerate XML development with our award-winning XML IDE - Download a free trial today!

Don't miss another message! Subscribe to this list today.
Email
First Name
Last Name
Company
Subscribe in XML format
RSS 2.0
Atom 0.3
Site Map | Privacy Policy | Terms of Use | Trademarks
Free Stylus Studio XML Training:
W3C Member
Stylus Studio® and DataDirect XQuery ™are products from DataDirect Technologies, is a registered trademark of Progress Software Corporation, in the U.S. and other countries. © 2004-2013 All Rights Reserved.