Combining consecutive siblings

Subject: Combining consecutive siblings
From: David Sewell <dsewell@xxxxxxxxxxxx>
Date: Wed, 29 Jul 2009 21:56:49 -0400 (EDT)
I'm trying to post-process the HTML produced via Adobe Acrobat's PDF export. (Actually, XHTML via Tidy from Acrobat's HTML 4.01.) Acrobat does something very funky with end-of-line hyphens that it deems "soft", namely wrapping the preceding and following text nodes inside a styled <span> and removing the hyphen. To simplify the situation, if the input text was

The volumes of the Docu-
mentary History of the Rati-
fication of the Constitution are heavy.

the output would be something like

<p>The volumes of the <i>Docu</i><i>mentary
History of the Rati</i><i>fication of the Constitution</i>
are heavy.</p>

Now there are various reasons why it would be nice to transform these constructs so that each run of consecutive <i> elements is merged into a single <i> element. I've come up with the following XSLT 2.0 templates that rely on the '>>' operator to group consecutive sibling <i>'s for processing. They work on some sample data, but this is a risky transform: if the logic is not perfect, <i>'s could be dropped. Can anyone see a potential case where this would fail?

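   <!-- Merge each run of adjacent <i> siblings into a single <i>. -->
   <xsl:template match="i">
      <xsl:choose>
         <xsl:when test="preceding-sibling::node()[1][self::i]">
            <!-- omit, the next when-clause (first <i> of the run) handles me -->
         </xsl:when>
         <xsl:when test="following-sibling::node()[1][self::i]">
            <!-- first <i> of a run: the run ends at the first non-<i> sibling -->
            <xsl:variable name="stopNode"
               select="following-sibling::node()[not(self::i)][1]"/>
            <xsl:copy>
               <xsl:apply-templates/>
               <!-- if $stopNode is empty, '>>' yields the empty sequence and
                    not() is true, so every following <i> is selected -->
               <xsl:apply-templates
                  select="following-sibling::i[not(. &gt;&gt; $stopNode)]"
                  mode="copy"/>
            </xsl:copy>
         </xsl:when>
         <xsl:otherwise>
            <!-- an <i> with no adjacent <i> sibling: copy it as is -->
            <xsl:copy><xsl:apply-templates/></xsl:copy>
         </xsl:otherwise>
      </xsl:choose>
   </xsl:template>

   <!-- mode "copy": emit only the content of an <i>, without its tags -->
   <xsl:template match="i" mode="copy">
      <xsl:apply-templates/>
   </xsl:template>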
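
As a cross-check on the templates above, the same grouping can be sketched with xsl:for-each-group and group-adjacent, which is also XSLT 2.0. This is a minimal sketch, not a drop-in replacement: the identity template and the match="*[i]" pattern are assumptions that do not appear in the original stylesheet. Like the templates above, it assumes the <i> elements are matched by the unprefixed name i (for Tidy's XHTML that would presumably mean xpath-default-namespace is set on the stylesheet).

```xslt
<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <!-- identity template (assumed): copy everything not handled below -->
   <xsl:template match="@*|node()">
      <xsl:copy>
         <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
   </xsl:template>

   <!-- any element with at least one <i> child: regroup its children -->
   <xsl:template match="*[i]">
      <xsl:copy>
         <xsl:apply-templates select="@*"/>
         <!-- adjacent children with the same key (is it an <i>?) form a group -->
         <xsl:for-each-group select="node()"
            group-adjacent="boolean(self::i)">
            <xsl:choose>
               <xsl:when test="current-grouping-key()">
                  <!-- a run of one or more adjacent <i>: emit a single copy
                       of the first one, holding the content of all of them -->
                  <xsl:variable name="run" select="current-group()"/>
                  <xsl:for-each select="$run[1]">
                     <xsl:copy>
                        <xsl:apply-templates select="$run/node()"/>
                     </xsl:copy>
                  </xsl:for-each>
               </xsl:when>
               <xsl:otherwise>
                  <!-- non-<i> nodes pass through unchanged -->
                  <xsl:apply-templates select="current-group()"/>
               </xsl:otherwise>
            </xsl:choose>
         </xsl:for-each-group>
      </xsl:copy>
   </xsl:template>

</xsl:stylesheet>
```

Running both versions over the same files and comparing the results should expose any dropped <i>. Two behaviours are shared by both: any node between two <i> elements, including a whitespace-only text node (unless whitespace is stripped), ends the run, and xsl:copy on an element does not copy its attributes, so attributes on <i> are not carried over.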

DS

--
David Sewell, Editorial and Technical Manager
ROTUNDA, The University of Virginia Press
PO Box 801079, Charlottesville, VA 22904-4318 USA
Courier: 310 Old Ivy Way, Suite 302, Charlottesville VA 22903
Email: dsewell@xxxxxxxxxxxx   Tel: +1 434 924 9973
Web: http://rotunda.upress.virginia.edu/
