RE: Converting non-pure trees to pure trees
> I have a XML file which I have automatically converted from
> msword, the basic structure is:
>
> <worddocument>
> <p>paragraph <b>hello</b> <i>world</i></p>
> <p>paragraph <b>hello</b> <i>world</i></p>
> <p>paragraph <b>hello</b> <i>world</i></p>
> <pagebreak/>
> <p>2/1</p>
> <p>paragraph <b>hello</b> <i>world</i></p>
> <p>paragraph <b>hello</b> <i>world</i></p>
> <p>paragraph <b>hello</b> <i>world</i></p>
> <pagebreak/>
> <p>2/2</p>
> <p>paragraph <b>hello</b> <i>world</i></p>
> <p>paragraph <b>hello</b> <i>world</i></p>
> <p>paragraph <b>hello</b> <i>world</i></p>
> </worddocument>
This is a grouping problem, of the kind I call "grouping by position".
Grouping problems in XSLT are not easy: for background, see
www.jenitennison.com.
All grouping problems require two nested loops. The outer loop selects a
representative element for each group, which in this case seems to be a <p>
element that is immediately preceded by a <pagebreak> element:
<xsl:for-each select="p[preceding-sibling::*[1][self::pagebreak]]">
<mongraph id="{.}">
...
</mongraph>
</xsl:for-each>
Inside this you need an inner loop that processes all the elements within
one group. In this case these are all the <p> elements that follow the
"representative" element, up to the next "representative" element. Or to put
it another way, all following <p> elements whose first preceding
<pagebreak> is the same as the first preceding <pagebreak> of the current
element.
So the inner loop can be:
<xsl:for-each select="following-sibling::p[
generate-id(preceding-sibling::pagebreak[1]) =
generate-id(current()/preceding-sibling::pagebreak[1])]">
<xsl:copy-of select="."/>
</xsl:for-each>
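Putting the two loops together, a complete stylesheet might look like the
sketch below. It assumes the element is named <pagebreak>, as in the sample
input, and that <worddocument> is the document element. The <introduction>
part (copying every <p> that has no preceding <pagebreak>) and the
<semanticdocument> and <mongraphs> wrappers are my assumption about how to
produce the rest of the requested output; they are not part of the grouping
logic described above.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/worddocument">
    <semanticdocument>

      <!-- Assumption: the first page is everything before the first
           pagebreak, i.e. every p with no preceding pagebreak sibling. -->
      <introduction>
        <xsl:copy-of select="p[not(preceding-sibling::pagebreak)]"/>
      </introduction>

      <mongraphs>
        <!-- Outer loop: one representative p per group, namely the p
             immediately after a pagebreak. Its text becomes the id. -->
        <xsl:for-each select="p[preceding-sibling::*[1][self::pagebreak]]">
          <mongraph id="{.}">
            <!-- Inner loop: following p elements whose nearest preceding
                 pagebreak is the same node as that of the representative.
                 current() is the representative p of the outer loop. -->
            <xsl:for-each select="following-sibling::p[
                generate-id(preceding-sibling::pagebreak[1]) =
                generate-id(current()/preceding-sibling::pagebreak[1])]">
              <xsl:copy-of select="."/>
            </xsl:for-each>
          </mongraph>
        </xsl:for-each>
      </mongraphs>

    </semanticdocument>
  </xsl:template>

</xsl:stylesheet>
```

The representative <p> itself (the one holding "2/1") is not copied into
the group, because the inner loop starts on the following-sibling axis; it
only supplies the id attribute.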
In Saxon there is a simpler solution using the saxon:leading() extension
function.
Mike Kay
>
> I wish to transform this tree using some knowledge I have
> about the document:
> The first page is always the "introduction", whilst all
> subsequent pages are "monographs"
>
> <semanticdocument>
> <introduction>
> <p>paragraph <b>hello</b> <i>world</i></p>
> <p>paragraph <b>hello</b> <i>world</i></p>
> <p>paragraph <b>hello</b> <i>world</i></p>
> </introduction>
> <mongraphs>
> <mongraph id="2/1">
> <p>paragraph <b>hello</b> <i>world</i></p>
> <p>paragraph <b>hello</b> <i>world</i></p>
> <p>paragraph <b>hello</b> <i>world</i></p>
> </mongraph>
> <mongraph id="2/2">
> <p>paragraph <b>hello</b> <i>world</i></p>
> <p>paragraph <b>hello</b> <i>world</i></p>
> <p>paragraph <b>hello</b> <i>world</i></p>
> </mongraph>
> </mongraphs>
> </semanticdocument>
>
>
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list