Subject: Re: illegal elements must go...
From: Jeni Tennison <mail@xxxxxxxxxxxxxxxx>
Date: Fri, 25 Aug 2000 12:53:16 +0100
Jukka,

>I'm having parent element which can have many childrens (in source). But the
>result side (in DTD) there are fewer possible elements. So, solution I'm
>gonna do is to put those 'illegal' (in result side) elements childrens of
>para element(s).

You can think of this as a grouping problem and apply a grouping solution
to it.  You're grouping all the illegal elements and text within a 'para'
element.  As with all grouping problems, you have to ask yourself what is
unique about these particular nodes that puts them in a group with each other?

The answer in cases like this is the identity of some following node.  In
your example:

<entry>
  <para>sometext</para>
  <jibii>sometext</jibii>
  sometext without tags - illegal
  <zzz>illegal element in result</zzz>
  <xxx>another illegal elem</xxx>
  <jibii>this elem is good</jibii>
  <xxx>another illegal</xxx>
</entry>

'sometext without tags - illegal' and the following 'zzz' and 'xxx'
elements all have the same preceding legal element
(<jibii>sometext</jibii>) and the same following legal element
(<jibii>this elem is good</jibii>).  So you can use this fact to group the
nodes together.

As usual I'll use the Muenchian Method and define a key:

<xsl:key name="illegal-nodes"
         match="xxx | zzz | entry/text()[normalize-space(.)]"
         use="generate-id(following-sibling::*[name() = 'para' or
                                               name() = 'jibii'])" />

The key matches the illegal nodes that you know about; change this
expression to match whatever nodes are illegal in your result.  Note that
I've selected only text that actually has some non-whitespace content.  The
key uses as its value the unique id of the first legal element (a 'para' or
a 'jibii') that follows the matched illegal node: generate-id() applied to
a node set returns the id of the node that comes first in document order.
You could put something more complex there in order to match other legal
nodes.
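
If the set of illegal elements is open-ended, one option is to turn the key
around and describe the illegal nodes as everything that isn't legal.  This
is only a sketch, assuming that 'para' and 'jibii' are the only legal
children of 'entry', as in your example.  Unlike the key above, it only
looks at direct children of 'entry'.  The [1] just makes explicit the
'first following legal element' that generate-id() would pick anyway:

```xml
<!-- Sketch: any child element of 'entry' other than 'para' or 'jibii'
     counts as illegal, along with non-whitespace text -->
<xsl:key name="illegal-nodes"
         match="entry/*[not(self::para or self::jibii)] |
                entry/text()[normalize-space(.)]"
         use="generate-id(following-sibling::*[self::para or
                                               self::jibii][1])" />
```

This is meant as a drop-in replacement for the key declaration; the
templates that call key('illegal-nodes', ...) stay as they are.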

Thus, within a template that matches on a legal element, you can use:

  <xsl:variable name="preceding-illegal-nodes"
                select="key('illegal-nodes', generate-id())" />
  <xsl:if test="$preceding-illegal-nodes">
    <para><xsl:copy-of select="$preceding-illegal-nodes" /></para>
  </xsl:if>

The variable $preceding-illegal-nodes holds the illegal nodes that precede
the current legal element, looked up in the key through its unique
identifier.  If there are such nodes, a copy of them is placed within a
'para' element.

You also need to make sure to copy any illegal nodes that come at the end
of the entry, so within the 'entry'-matching template similarly have:

    <xsl:variable name="ending-illegal-nodes"
                  select="key('illegal-nodes', '')" />
    <xsl:if test="$ending-illegal-nodes">
      <para><xsl:copy-of select="$ending-illegal-nodes" /></para>
    </xsl:if>

The key value of '' gets all those nodes that were given a key value that
was the result of calling generate-id() on an empty node set.

With those in place, you just need to be sure that you're only applying
templates to the legal nodes.  The final stylesheet is:

----
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:key name="illegal-nodes"
         match="xxx | zzz | entry/text()[normalize-space(.)]"
         use="generate-id(following-sibling::*[name() = 'para' or
                                               name() = 'jibii'])" />

<xsl:template match="entry">
  <entry>
    <xsl:apply-templates select="para | jibii" />
    <xsl:variable name="ending-illegal-nodes"
                  select="key('illegal-nodes', '')" />
    <xsl:if test="$ending-illegal-nodes">
      <para><xsl:copy-of select="$ending-illegal-nodes" /></para>
    </xsl:if>
  </entry>
</xsl:template>

<xsl:template match="*">
  <xsl:variable name="preceding-illegal-nodes"
                select="key('illegal-nodes', generate-id())" />
  <xsl:if test="$preceding-illegal-nodes">
    <para><xsl:copy-of select="$preceding-illegal-nodes" /></para>
  </xsl:if>
  <xsl:copy-of select="."/>
</xsl:template>

</xsl:stylesheet>
----
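
For reference, here is what I'd expect the stylesheet to produce from your
example entry.  This is worked out by hand rather than captured from a
processor, and I've reindented it; the exact whitespace and any XML
declaration will depend on the processor and its output settings:

```xml
<entry>
  <para>sometext</para>
  <jibii>sometext</jibii>
  <para>
  sometext without tags - illegal
  <zzz>illegal element in result</zzz>
  <xxx>another illegal elem</xxx>
  </para>
  <jibii>this elem is good</jibii>
  <para><xxx>another illegal</xxx></para>
</entry>
```

The text and the 'zzz' and 'xxx' elements that sit between the two 'jibii'
elements end up together in one 'para', and the trailing 'xxx' gets a
'para' of its own.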

This works in SAXON.  It doesn't work in Xalan, which ignores the illegal
nodes that occur at the end of the 'entry': either Xalan doesn't produce an
empty string when generate-id() is called on an empty node set, or it
doesn't like having key values that are empty strings.
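
One possible workaround, which I haven't tried in Xalan, is to avoid the
empty key value altogether by putting the id of the parent in front of the
id of the following legal element.  The sketch below assumes that the empty
string is what causes the trouble.  The '|' separator is my own choice;
generate-id() only produces ASCII alphanumeric characters, so it can't
clash with an id.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<!-- Key value is "<parent id>|<id of first following legal sibling>".
     For trailing illegal nodes the second part is empty, so the value
     is "<parent id>|" rather than an empty string. -->
<xsl:key name="illegal-nodes"
         match="xxx | zzz | entry/text()[normalize-space(.)]"
         use="concat(generate-id(..), '|',
                     generate-id(following-sibling::*[name() = 'para' or
                                                      name() = 'jibii']))" />

<xsl:template match="entry">
  <entry>
    <xsl:apply-templates select="para | jibii" />
    <!-- illegal nodes at the end of this particular entry -->
    <xsl:variable name="ending-illegal-nodes"
                  select="key('illegal-nodes',
                              concat(generate-id(), '|'))" />
    <xsl:if test="$ending-illegal-nodes">
      <para><xsl:copy-of select="$ending-illegal-nodes" /></para>
    </xsl:if>
  </entry>
</xsl:template>

<xsl:template match="*">
  <!-- illegal nodes between the previous legal sibling and this one -->
  <xsl:variable name="preceding-illegal-nodes"
                select="key('illegal-nodes',
                            concat(generate-id(..), '|', generate-id()))" />
  <xsl:if test="$preceding-illegal-nodes">
    <para><xsl:copy-of select="$preceding-illegal-nodes" /></para>
  </xsl:if>
  <xsl:copy-of select="."/>
</xsl:template>

</xsl:stylesheet>
```

There is a side benefit if you slot these templates into a larger
stylesheet that processes several entries.  key() searches the whole
document, so key('illegal-nodes', '') collects the trailing illegal nodes
of every 'entry' at once, whereas the composite value keeps each entry's
nodes apart.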

I think that this solution sits fairly well alongside the other solutions
that were proposed.

Cheers,

Jeni



 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

