
RE: Grouping text nodes

Subject: RE: Grouping text nodes
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Wed, 3 Aug 2005 11:56:59 +0100
In XSLT 1.0 I would tackle this using the technique that I've started
referring to as "sibling recursion". The general pattern is:

(a) From the parent element do

   <xsl:apply-templates select="child::node()[1]" mode="sibling-recursion"/>

(b) Write one or more templates that match the child elements; the structure
of these is:

<xsl:template match="xxx" mode="sibling-recursion">
   ... process this node ...
   <xsl:apply-templates select="following-sibling::node()[1]"
                        mode="sibling-recursion">
      ... with-params ...
   </xsl:apply-templates>
</xsl:template>
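
As a minimal sketch of this pattern applied to the "text<br/>" case (not a
full solution to the problem below), the stylesheet here walks the children
of a div one sibling at a time and carries a line counter along as a
parameter. The mode names "line-start" and "in-line", the parameter name "n"
and the no-namespace element names are assumptions made for illustration;
real XHTML input would need a namespace prefix in the match patterns.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Drop whitespace-only text between <br/> and the next element. -->
  <xsl:strip-space elements="div"/>

  <!-- (a) From the parent, start the recursion at the first child. -->
  <xsl:template match="div">
    <xsl:apply-templates select="child::node()[1]" mode="line-start"/>
  </xsl:template>

  <!-- (b) The current node starts a line: wrap everything up to the
       next br in <l>, then restart after that br with n + 1. -->
  <xsl:template match="node()" mode="line-start">
    <xsl:param name="n" select="1"/>
    <l n="{$n}">
      <xsl:apply-templates select="." mode="in-line"/>
    </l>
    <xsl:apply-templates
        select="(self::br | following-sibling::br)[1]
                /following-sibling::node()[1]"
        mode="line-start">
      <xsl:with-param name="n" select="$n + 1"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- Inside a line: copy this node, then move to the next sibling. -->
  <xsl:template match="node()" mode="in-line">
    <xsl:copy-of select="."/>
    <xsl:apply-templates select="following-sibling::node()[1]"
                         mode="in-line"/>
  </xsl:template>

  <!-- A br ends the line: output nothing and stop the recursion. -->
  <xsl:template match="br" mode="in-line"/>

</xsl:stylesheet>
```

The same with-param mechanism could carry the id of the most recent verse
span forward, so that no template has to look backwards for it.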

In 2.0, converting "text<br/>" to "<line>text</line>" is often conveniently
done using xsl:for-each-group with group-ending-with="br".
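
A minimal sketch of the 2.0 grouping, assuming the lines are direct children
of a div in no namespace (the wrapper element name "lines" is made up for
illustration):

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Drop whitespace-only text nodes so they do not form empty lines. -->
  <xsl:strip-space elements="div"/>

  <xsl:template match="div">
    <lines>
      <!-- Each group runs up to and including the next br. -->
      <xsl:for-each-group select="node()" group-ending-with="br">
        <line>
          <!-- Copy the group's content, leaving out the br itself. -->
          <xsl:copy-of select="current-group()[not(self::br)]"/>
        </line>
      </xsl:for-each-group>
    </lines>
  </xsl:template>

</xsl:stylesheet>
```

Any content after the last br forms a final group of its own, which is why
the trailing whitespace is stripped first.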


This doesn't by itself help with your problem of handling the irregularities
in your input data. I think that when you have such irregularities, it's
often best to write a multiphase transformation in which each phase tries to
make the structure a bit more regular, making it easier for subsequent
phases to do their work.
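
A rough sketch of what two such phases might look like in 2.0, using a
variable to hold the intermediate tree. The mode names "phase1" and "phase2"
and the variable name "regular" are assumptions, as is the choice to flatten
the inner poetry div into milestones in the first phase; element names are
again taken to be in no namespace.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:strip-space elements="div"/>

  <!-- Run phase 1 into a temporary tree, then run phase 2 over it. -->
  <xsl:template match="/">
    <xsl:variable name="regular">
      <xsl:apply-templates mode="phase1"/>
    </xsl:variable>
    <xsl:apply-templates select="$regular" mode="phase2"/>
  </xsl:template>

  <!-- Phase 1: copy everything unchanged ... -->
  <xsl:template match="@* | node()" mode="phase1">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" mode="phase1"/>
    </xsl:copy>
  </xsl:template>

  <!-- ... except the inner poetry div, which is replaced by a pair of
       milestones so that its content becomes siblings of the rest. -->
  <xsl:template match="div[@class='poetry']" mode="phase1">
    <milestone type="{@class}"/>
    <xsl:apply-templates mode="phase1"/>
    <milestone type="Endof{@class}"/>
  </xsl:template>

  <!-- Phase 2: with the structure flat, group the chapter into lines. -->
  <xsl:template match="div[@class='chapter']" mode="phase2">
    <div type="chapter">
      <xsl:for-each-group select="node()" group-ending-with="br">
        <l>
          <xsl:copy-of select="current-group()[not(self::br)]"/>
        </l>
      </xsl:for-each-group>
    </div>
  </xsl:template>

</xsl:stylesheet>
```

Further phases (for example, one that wraps the lines of each verse in an lg
element) could be chained in the same way, each working on the output of the
one before.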

But I'm afraid these are only rough ideas - I don't have time to get
immersed in the detail of what looks quite a challenging problem.

Michael Kay
http://www.saxonica.com/


> -----Original Message-----
> From: James Cummings [mailto:cummings.james@xxxxxxxxx] 
> Sent: 03 August 2005 10:49
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject:  Grouping text nodes
> 
> Hi there,
> 
> I have some XHTML I'm trying to transform to add more structure to it.
>  It is a copy of the Latin Vulgate Bible.  Currently the XHTML looks
> something like this:
> -----
> <div class="chapter">
> <span class="chapter-num">1</span>
>         <div class="poetrystartchapter">
>                     <span class="vn" 
> id="x1_1">1</span>&nbsp;Beatus vir qui
>                     non abiit in consilio impiorum,<br/> et 
> in via peccatorum
>                     non stetit,<br/> et in cathedra pestilenti&aelig;
> non sedit&nbsp;;<br/>
>                     <span class="vn" 
> id="x1_2">2</span>&nbsp;sed in lege
>                     Domini voluntas ejus,<br/> et in lege 
> ejus meditabitur die
>                     ac nocte.<br/>
>                     <span class="vn" 
> id="x1_3">3</span>&nbsp;Et erit tamquam
>                     lignum quod plantatum est secus decursus 
> aquarum,<br/> quod
>                     fructum suum dabit in tempore 
> suo&nbsp;:<br/> et folium
>                     ejus non defluet&nbsp;;<br/> et omnia
>                     qu&aelig;cumque faciet prosperabuntur.<br/>
> ...</div>...</div>
> -----
> What I want to get is something like:
> -----
> <div type="chapter" n="1">
>              <milestone type="poetrystartchapter"/>
>              <lg xml:id="x1_1" n="1">
>                     <l xml:id="x1_1-1">Beatus vir qui
>                     non abiit in consilio impiorum,</l>
>                    <l xml:id="x1_1-2">et in via peccatorum 
> non stetit,</l>
>                     <l xml:id="x1_1-3">et in cathedra
> pestilenti&aelig; non sedit </l>
>               </lg>
>               <lg xml:id="x1_2" n="2">
>                     <l xml:id="x1_2-1"> sed in lege Domini 
> voluntas ejus,</l>
>                     <l xml:id="x1_2-2">et in lege ejus meditabitur die
> ac nocte.</l>
>                </lg>
>                 <lg xml:id="x1_3">
>                      <l xml:id="x1_3-1"> Et erit tamquam
>                     lignum quod plantatum est secus decursus 
> aquarum,</l>
>                     <l xml:id="x1_3-2"> quod fructum suum dabit in
> tempore suo :</l>
>                     <l xml:id="x1_3-3"> et folium ejus non 
> defluet;</l>
>                     <l xml:id="x1_3-4"> et omnia qu&aelig;cumque
> faciet prosperabuntur.</l>
>                      </lg>
> <milestone type="EndOfpoetrystartchapter"/>
> ...</div>
> -----
> My problem is when I'm looking backwards to create the @xml:id for
> each of the lines whilst grouping the text nodes into lines. 
> Sometimes there is extra existing structure which seems to get in the
> way, where the <div> (if present at all) starts after the first line
> 
> -----
>  <div class="chapter"><span class="chapter-num">118</span>
>                 <span class="vn" id="x118_1">1</span>&nbsp;Alleluja. 
>                     <div class="poetry"><span
> class="speaker">Aleph.</span> Beati
>                     immaculati in via,<br/> qui ambulant in 
> lege Domini.<br/>
>                     <span class="vn" 
> id="x118_2">2</span>&nbsp;Beati qui
>                     scrutantur testimonia ejus&nbsp;;<br/> in 
> toto corde
>                     exquirunt eum.<br/>
> -----
> Which is supposed to come out something like:
> -----
>  <div type="chapter" n="118">
>                 <lg xml:id="x118_1" n="1">
>                      <l xml:id="x118_1-1">Alleluja.
>                       <milestone type="poetry"/>
>                     <seg type="speaker">Aleph.</seg> Beati immaculati
> in via,</l>
>                      <l xml:id="x118_1-2"> qui ambulant in 
> lege Domini.</l>
>                  </lg>
>                   <lg>
>                      <l xml:id="x118_2-1"> Beati qui scrutantur
> testimonia ejus; </l>
>                      <l xml:id="x118_2-2"> in toto corde  
> exquirunt eum.</l>
>                   </lg>
>                    <milestone type="Endofpoetry"/>
> ... </div>
> -----
> At the moment, when matching text() to create the lines, I then look
> back (preceding:: or preceding-sibling::) to the span and grab the
> span/@id to create the l/@xml:id... but in instances like psalm 118
> where another div or span gets in the way it tends to muck up.
> 
> So I'm convinced there is probably an entirely better way to do this. 
> Any suggestions?
> 
> Many Thanks,
> -James
> 
> -- 
> James Cummings, Cummings dot James at GMail dot com
