Re: exercise in complex grouping
I have assumed there are no complicated overlapping cases here, but this
works for the one case I tried, which includes both A before B and B before A:
<x>
blah blah blah
<d><e>blah</e> blah
<B target="#A1">blort</B>
<f>monkey</f> shines
<A xml:id="A1">snort</A>
blah
<A xml:id="A2">snort</A>
<q/>
<l>zzz</l>
<B target="#A2">blort</B>
<kkkk/>
</d>
zzz
</x>
----
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="node()">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
<xsl:key name="b" match="B" use="substring(@target,2)"/>
<xsl:template match="d">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:for-each-group select="node()" group-adjacent="self::B or
self::A[key('b',@xml:id)]">
<xsl:choose>
<xsl:when test="current-grouping-key()">
<!-- Output nothing here: a paired A or B is copied into C by the neighbouring group below. -->
</xsl:when>
<xsl:otherwise>
<xsl:variable name="a" select="preceding-sibling::*[1]"/>
<xsl:variable name="b"
select="current-group()[last()]/following-sibling::*[1]"/>
<xsl:choose>
<xsl:when test="concat('#',$a/@xml:id)=$b/@target or
concat('#',$b/@xml:id)=$a/@target">
<xsl:text> </xsl:text><C><xsl:text> </xsl:text>
<xsl:copy-of select="$a"/>
<xsl:text> </xsl:text>
<xsl:copy-of select="current-group()"/>
<xsl:text> </xsl:text>
<xsl:copy-of select="$b"/>
<xsl:text> </xsl:text></C><xsl:text> </xsl:text>
</xsl:when>
<xsl:otherwise>
<xsl:copy-of select="current-group()"/>
</xsl:otherwise>
</xsl:choose>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each-group>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
---
produces
<x>
blah blah blah
<d><e>blah</e> blah
<C>
<B target="#A1">blort</B>
<f>monkey</f> shines
<A xml:id="A1">snort</A>
</C>
blah
<C>
<A xml:id="A2">snort</A>
<q/>
<l>zzz</l>
<B target="#A2">blort</B>
</C>
<kkkk/>
</d>
zzz
</x>
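
The requested output also carried an xml:id on the wrapper (e.g. "A1Container"), which the stylesheet above does not add. A minimal sketch of one way to get it, assuming the same input shape and the same no-overlap assumption, is to build the value with an attribute value template on the literal <C> element. The "Container" suffix comes from the requested output; the expression ($a | $b)[self::A] is my own way of picking whichever neighbour is the A. Like the original, this relies on at least one node (even whitespace) sitting between the A and the B.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Copy everything that is not handled by a more specific template. -->
  <xsl:template match="node()">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <!-- Index each B by the id it points to (@target minus the leading '#'). -->
  <xsl:key name="b" match="B" use="substring(@target,2)"/>

  <xsl:template match="d">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:for-each-group select="node()"
          group-adjacent="self::B or self::A[key('b',@xml:id)]">
        <xsl:choose>
          <!-- Paired A/B elements: emitted by the neighbouring group, not here. -->
          <xsl:when test="current-grouping-key()"/>
          <xsl:otherwise>
            <!-- Elements immediately before and after this run of nodes. -->
            <xsl:variable name="a" select="preceding-sibling::*[1]"/>
            <xsl:variable name="b"
                select="current-group()[last()]/following-sibling::*[1]"/>
            <xsl:choose>
              <xsl:when test="concat('#',$a/@xml:id)=$b/@target or
                              concat('#',$b/@xml:id)=$a/@target">
                <!-- Assumption: the wrapper id is the A's id plus 'Container'. -->
                <C xml:id="{($a | $b)[self::A]/@xml:id}Container">
                  <xsl:copy-of select="$a, current-group(), $b"/>
                </C>
              </xsl:when>
              <xsl:otherwise>
                <xsl:copy-of select="current-group()"/>
              </xsl:otherwise>
            </xsl:choose>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

On the sample input above, the two wrappers should come out with the ids A1Container and A2Container. The attribute is written on the literal element rather than with xsl:attribute so that it is always created before any child nodes of <C>.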
On Tue, 12 May 2020 at 10:33, Syd Bauman s.bauman@xxxxxxxxxxxxxxxx <
xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> I have a moderately sizable TEI file (~31,000 text nodes with ~100,400
> "words" or ~688,000 characters; ~20,000 elements, ~15,000 attributes).
> Somewhere in all that mess there are a few pairs of elements for which
> I need some special processing.
>
> Say each pair is an <A> and a <B>. I can find each <B> by XPath quite
> trivially. In addition, for every pair, <B> has a @target that points
> to the corresponding <A> via a bare name identifier URL. Furthermore,
> every <B> in the document is part of such a pair. (Which is why it is
> so trivial to find them via XPath. The same cannot be said for <A>:
> there are *lots* of <A> elements that are not part of an <A>-<B>
> pair; but none, of course, that bear that particular @xml:id, so they
> can be found by XPath. It's just easy, not trivial. :-)
>
> In general, there can be other nodes between <A> and <B>, and there
> will be cases in which <B> precedes rather than follows the <A> it
> points to. E.g.,
>
> blah blah blah
> <d><e>blah</e> blah
> <B target="#A1">blort</B>
> <f>monkey</f> shines
> <A xml:id="A1">snort</A>
> blah</d>
>
> I want to be able to handle these cases, too.
>
> For the foreseeable future, there will never be another <B> in between
> a <B> and the <A> it points to, and each <B> will be a child of the
> same element as the <A> it points to. (I.e., no overlap problems.) But
> as soon as I say these complications will never happen, the very next
> day the editors will gleefully send e-mail saying they have found such a
> case. But for now, if needed, I'm willing to write code that presumes
> it won't happen.
>
> What I want for output is to be able to wrap the <B> with the <A> it
> points to, *and everything in between* in a <C>.
>
> blah blah blah
> <d><e>blah</e> blah
> <C xml:id="A1Container">
> <B target="#A1">blort</B>
> <f>monkey</f> shines
> <A xml:id="A1">snort</A>
> </C>
> blah</d>
>
> I am 90% confident I can write some messy XSLT 1.0 Muenchian grouping
> code that does this. (Although I suspect it would take two passes,
> one for <A> precedes <B>, another for <B> precedes <A>; but I don't
> care about two passes at all, and would not even care if it took N
> passes.[1]) But I am equally confident there is a much better
> <xsl:for-each-group> method that, at the moment, I simply can't wrap
> my head around.
>
> Thanks for any thoughts, pointers, code, or advice.
>
> Note
> ----
> [1] Where N is proportional to the number of <A>-<B> pairs.
>
> --
> Syd Bauman, NRP (he/him/his)
> Senior XML Programmer/Analyst
> Northeastern University Women Writers Project
> s.bauman@xxxxxxxxxxxxxxxx or
> Syd_Bauman@xxxxxxxxxxxxxxxx