
Re: Joining list fragments

Subject: Re: Joining list fragments
From: "Michael Müller-Hillebrand mmh@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sat, 9 May 2020 19:07:15 -0000
Folks,

That piece of code by Gerrit is beautiful:
https://github.com/gimsieke/join-list-fragments

My personal original strategy was to pick the beginning of a broken list and
then walk the following-sibling axis to collect all "joinable" list fragments.
As Gerrit mentioned early on

> This looks like a nested group-starting-with / group-adjacent to me at first
glance.


I was willing to listen. But I could never have come up with this beauty.

In a private mail Gerrit confessed his mantra: »Everything that looks
remotely like grouping has to be grouped!« What a fitting motto for the King
of Grouping! (Should have that in Latin, though. Or some native speaker comes
up with better phrasing, please.)

The main template uses group-starting-with to catch the broken lists, and a
nested group-adjacent to select the "joinable" parts from each
current-group(). BTW, has anyone seen innermost() or outermost() used before?

The main part is in template "collect", which uses group-starting-with, a
nested group-adjacent, and another group-starting-with with recursion for more
list levels. And this is the most beautiful solution, which I could never have
imagined: There is exactly one XPath expression using element and attribute
names from the source document. The rest is logic and evaluating list item
levels. Wow!
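
For readers less familiar with the two grouping modes, here is a rough
sketch of their semantics in Python. It is not Gerrit's stylesheet; the
flattened (kind, text) tuples and both helper functions are hypothetical
stand-ins that only mimic how xsl:for-each-group partitions a sequence
with group-starting-with and group-adjacent.

```python
from itertools import groupby


def group_starting_with(items, starts):
    """Open a new group at every item for which starts(item) is true.

    As with group-starting-with in XSLT, items that precede the first
    match form a group of their own.
    """
    groups = []
    for item in items:
        if starts(item) or not groups:
            groups.append([])
        groups[-1].append(item)
    return groups


def group_adjacent(items, key):
    """Collect runs of neighbouring items that share the same key value."""
    return [(k, list(run)) for k, run in groupby(items, key)]


# Hypothetical flattened input: (kind, text) pairs, not the real debug file.
items = [
    ("other", "heading"),
    ("start", "1st item"),
    ("collect", "Hint"),
    ("continue", "Para ff"),
    ("other", "Other arbitrary content"),
]

# Outer grouping catches each broken list; inner grouping separates the
# "joinable" run from whatever follows it.
for group in group_starting_with(items, lambda it: it[0] == "start"):
    for joinable, run in group_adjacent(group, lambda it: it[0] != "other"):
        print(joinable, [text for _, text in run])
```

The outer call yields two groups here (the heading alone, then everything
from the start item onward), and the inner call splits the second group
into the joinable run and the trailing arbitrary content.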

Thank you very much,

- Michael MH


> On 06.05.2020 at 08:26, Imsieke, Gerrit, le-tex
gerrit.imsieke@xxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Ok, it turned out that some recursion is necessary.
>
> Michael (Müller-Hillebrand) sent me an updated test file and the expected
results. As one can expect, the problem is even more complex than Michael's
initial sample input suggests, due to the merging on multiple levels that is
required.
>
> But today I take pride in saying that the self-declared king of grouping (I)
was able to solve it!
>
> https://github.com/gimsieke/join-list-fragments
>
> The solution is remotely similar to what I presented about "upward
projection" at XML Prague 2019
(https://subversion.le-tex.de/common/presentations/2019-02-09_xmlprague_xslt-upward-projection/slides/)
in that leaf nodes are grouped and the surrounding
subtree is later reconstructed.
>
> If you run the example (apply xsl/join-list-fragments to
test/sample_html.xml in #default mode), you will notice that a file
debug1_atomic-items.xml is created. This is a somewhat flattened input that I
looked at intensely and that I gradually modified when I set up the grouping.
I can't stress enough how much looking at this semi-flattened file and the
ad-hoc attributes that I created informed the evolution of the grouping.
Without this debugging output, it would have been too complex to understand
what is going on and what should happen in the recursive grouping.
>
> The debugging output has the following additional attributes:
>
> list-level: 0 for uninteresting elements; the attribute is absent for
elements that need to be collected with the preceding list item; any positive
value indicates the nesting depth at which a new list item will be created for
the group starting at that element
>
> start: 'true' for an element that will become the first item of a (re-)
created top-level ol element
>
> start-level: the depth at which a re-created ol element will be created (2
indicates an ol/li/ol). This attribute is not used for top-level lists, where
@start is used.
>
> It may be that an additional recursion is necessary if there is more
variation than start-level="2". Maybe MMH can create more input that also
contains such a case, but it might well be that it isn't relevant for their
problem.
>
> I might eventually add more documentation to the XSLT. At this stage, even
with what I wrote above, it's a bit obscure -- write-only code -- which often
is the case for recursive grouping. Running it in oXygen debugger with
appropriate breakpoints and with inspecting current-group() might further
illustrate how it works.
>
> Gerrit
>

>>
>> We want to join list fragments and some content in between them. An
HTML-ish version of the input looks like this:
>>
>> <div>
>>   <h2 id="E2">Item with content to be joined follows div to collect</h2>
>>   <div>
>>     <ol data-meta="listlevel=start">
>>       <li>
>>         <p>1st item</p>
>>       </li>
>>     </ol>
>>     <div class="box" data-meta="collect">
>>       <p>Hint</p>
>>     </div>
>>     <ol data-meta="listlevel=continue">
>>       <li data-meta="listitem=continue">
>>         <p>Para ff</p>
>>       </li>
>>       <li>
>>         <p>2nd item</p>
>>       </li>
>>     </ol>
>>     <p>Other arbitrary content</p>
>>   </div>
>> </div>
>>
>> Every broken list sequence starts with data-meta="listlevel=start" and a
list or a list item that is supposed to be joined with the start list is
marked using data-meta="listlevel=continue" and data-meta="listitem=continue".
There can be any number of collect items between lists and multiple continue
lists, but it is guaranteed that whatever needs to be collected will end with
a list. In DTD content model notation: startList, (collectItem*,
continueList)+
>>
>> The lists are not limited to a single level. Gladly, if there is a
"listitem=continue" in a continue list, it is guaranteed to be at the same
level the previous list ends.
>>
>> The task is to add to the last item of the previous list:
>> * all content marked "collect" between the lists; other content would break
the process
>> * content of the next list’s first list item if marked
"listitem=continue"
>> The remaining content of each continue list would be added as additional
items to the start list.
>>
>> The desired result for the input data above would look like this:
>>
>> <div>
>>   <h2 id="E2">Item with content to be joined follows div to collect</h2>
>>   <div>
>>     <ol data-meta="listlevel=start">
>>       <li>
>>         <p>1st item</p>
>>         <div class="box" data-meta="collect">
>>           <p>Hint</p>
>>         </div>
>>         <p>Para ff</p>
>>       </li>
>>       <li>
>>         <p>2nd item</p>
>>       </li>
>>     </ol>
>>     <p>Other arbitrary content</p>
>>   </div>
>> </div>
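
As a cross-check of the requirement quoted above, the single-level case can
be sketched procedurally in Python with xml.etree.ElementTree. This is only
an illustration under simplifying assumptions, not the grouping solution
from the repository: the function name join_fragments is hypothetical, only
one list level is handled, every list is assumed to contain at least one
item, and collect items are moved as soon as they are seen (relying on the
stated guarantee that a continue list always follows).

```python
import xml.etree.ElementTree as ET

# Inner <div> of the sample input from the original question.
SAMPLE = """<div>
  <ol data-meta="listlevel=start"><li><p>1st item</p></li></ol>
  <div class="box" data-meta="collect"><p>Hint</p></div>
  <ol data-meta="listlevel=continue">
    <li data-meta="listitem=continue"><p>Para ff</p></li>
    <li><p>2nd item</p></li>
  </ol>
  <p>Other arbitrary content</p>
</div>"""


def join_fragments(parent):
    """Merge collect items and continue lists into the preceding start list."""
    start = None  # the start list currently being extended, if any
    for child in list(parent):  # iterate over a copy, since children get removed
        meta = child.get("data-meta")
        if child.tag == "ol" and meta == "listlevel=start":
            start = child
        elif start is not None and meta == "collect":
            parent.remove(child)
            start[-1].append(child)  # goes into the last item of the start list
        elif start is not None and child.tag == "ol" and meta == "listlevel=continue":
            parent.remove(child)
            for index, li in enumerate(list(child)):
                if index == 0 and li.get("data-meta") == "listitem=continue":
                    start[-1].extend(list(li))  # content continues the last item
                else:
                    start.append(li)  # remaining items become new items
        else:
            start = None  # any other content breaks the sequence
    return parent


root = join_fragments(ET.fromstring(SAMPLE))
print([c.tag for c in root.find("ol")[0]])  # ['p', 'div', 'p']
```

After the call, the parent holds a single ol with two items followed by the
untouched paragraph, which matches the desired result shown above for this
input. Multi-level lists would need the recursion Gerrit describes.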
