
Splitting text nodes - xsl:iterate?

Subject: Splitting text nodes - xsl:iterate?
From: "Tom Cleghorn tcleghorn@xxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 12 Nov 2014 14:10:22 -0000
Hi list,

Given an input document looking something like this:
<doc>
  <head><foo/><bar/><baz/></head>
  <body>
    <sec>
      <para>Lorem ipsum dolor sit amet, consectetur adipiscing elit.<box 
outline="maybe"><para quack="y">Proin id <?foo bar?>bibendum urna, <baz>ut 
ornare</baz> mi.</para></box></para>
      <para>Aenean dui risus, <qux>sodales quis leo sit amet, ornare 
consequat</qux> metus. Ut vel massa congue, egestas nibh et, rutrum 
odio.</para>
    </sec>
  </body>
</doc>

(i.e. document markup consisting of arbitrary text and element nodes 
nested to some unknown depth)

and the requirement for two separate outputs looking like these:
<doc>
  <head><foo/><bar/><baz/></head>
  <body>
    <sec>
      <para><new:start/>Lorem ipsum dolor sit amet, consectetur adipiscing 
elit.<box outline="maybe"><para quack="y">Proin id <?foo bar?>bibendum 
urna, <baz>ut ornare</baz> mi.</para></box></para>
      <para>Aenean dui risus, <qux>sodales quis <new:end/>leo sit amet, 
ornare consequat</qux> metus. Ut vel massa congue, egestas nibh et, rutrum 
odio.</para>
    </sec>
  </body>
</doc>

<sec>
  <para>Lorem ipsum dolor sit amet, consectetur adipiscing elit.<box 
outline="maybe"><para quack="y">Proin id <?foo bar?>bibendum urna, <baz>ut 
ornare</baz> mi.</para></box></para>
  <para>Aenean dui risus, <qux>sodales quis [...]</qux></para>
</sec>

(i.e. a copy of the input, with new:start and new:end elements marking the 
first 20 words of the document; and, separately, a copy of those first 20 
words, preserving all markup within them and adding an ellipsis at the 
end)

...how might I best approach the transformation idiomatically in XSLT? I 
feel that there should be some neat declarative way of doing it, possibly 
using xsl:iterate and/or accumulators, that I'm just failing to see. XSLT 
3.0 is available (Saxon 9.6), but sadly the source documents are legacy 
content and cannot be changed. I've tried using xsl:iterate, but I keep 
losing track of whether or not I'm processing the specific text node in 
which the break needs to occur.
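
For concreteness, here is a rough brute-force sketch of the first output 
only (the new:start/new:end markers), written in plain XSLT 2.0 so that it 
sidesteps xsl:iterate and accumulators altogether. It is not meant as the 
neat declarative answer, just a statement of the logic. Several things in 
it are assumptions rather than givens: the namespace URI urn:example:new 
for the new: prefix, the parameter name limit, and the definition of a 
"word" as a whitespace-separated token within a single text node (so a 
word that straddles an element boundary would be counted as two). It also 
recomputes running totals for every candidate node, so it is quadratic in 
the number of text nodes.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:new="urn:example:new"
    exclude-result-prefixes="xs">

  <!-- Number of words to mark (assumed parameter name; 20 in the example) -->
  <xsl:param name="limit" as="xs:integer" select="20"/>

  <!-- Text nodes in the body that contain at least one word, in document order -->
  <xsl:variable name="texts" as="text()*"
      select="/doc/body//text()[normalize-space()]"/>

  <!-- Word count of each of those text nodes -->
  <xsl:variable name="counts" as="xs:integer*"
      select="for $t in $texts
              return count(tokenize(normalize-space($t), ' '))"/>

  <!-- Index of the first text node at which the running total reaches
       the limit; empty if the body has fewer words than the limit -->
  <xsl:variable name="hit" as="xs:integer?"
      select="(for $i in 1 to count($texts)
               return if (sum(subsequence($counts, 1, $i)) ge $limit)
                      then $i else ())[1]"/>

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Explicit priority avoids a conflict with node() in the identity template -->
  <xsl:template match="text()" priority="1">
    <!-- Start marker goes immediately before the first text node with words -->
    <xsl:if test=". is $texts[1]">
      <new:start/>
    </xsl:if>
    <xsl:choose>
      <xsl:when test="exists($hit) and . is $texts[$hit]">
        <!-- Words still needed from this node to reach the limit -->
        <xsl:variable name="need" as="xs:integer"
            select="$limit - sum(subsequence($counts, 1, $hit - 1))"/>
        <!-- Regex for the leading run of that many whole words,
             including the whitespace that follows each one -->
        <xsl:variable name="re" as="xs:string"
            select="concat('^\s*(\S+(\s+|$)){', $need, '}')"/>
        <!-- Everything after that leading run -->
        <xsl:variable name="tail" as="xs:string" select="replace(., $re, '')"/>
        <xsl:value-of
            select="substring(., 1, string-length(.) - string-length($tail))"/>
        <new:end/>
        <xsl:value-of select="$tail"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="."/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

On the sample input above, the running word totals across the body's text 
nodes would be 8, 10, 12, 14, 15, 18, 25, so the split falls in the text 
node beginning "sodales quis", two words in. The second output (the 
truncated copy with the ellipsis) is not covered by this sketch.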

Am I making a rod for my own back here? Should I just be breaking out to a 
custom Java function and crossing my fingers that I manage to avoid 
ill-formed output? Any advice will be very gratefully received!

Thanks!
