Subject: Problems with mixed content and inline elements when transforming XHTML into another XML format
From: Tony Kinnis <kinnist@xxxxxxxxx>
Date: Wed, 22 Feb 2006 14:28:53 -0800 (PST)
Hello all,

I have been trying to solve this problem for a few days now and I have
had no luck. I am hoping someone here can help me out with this.

I need to parse XHTML and transform it into another XML format. I am
sure that the XHTML is valid and well formed (I am running it through
HTML Tidy). The first problem I encountered was mixed content, that is,
elements that contain both text and child elements. Something like...

<div>
     My name is <b>bob</b>. What is yours?
    <ul>
         <li>foo</li>
         <li>bar</li>
    </ul>
</div>

I found a utility script on the web that can turn mixed content into
element content. I am guessing some of you have seen this script
before.

    <xsl:template match="text()[normalize-space(.)][../*]">
        <xsl:element name="textnode">
            <xsl:value-of select="."/>
        </xsl:element>
    </xsl:template>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

This makes the above sample look like...

<div>
     <textnode>My name is </textnode><b>bob</b><textnode>. What is yours?</textnode>
    <ul>
         <li>foo</li>
         <li>bar</li>
    </ul>
</div>

However, what I would really like to do is have the bold tags included
inside of the textnode tag so that it looks like...

<div>
     <textnode>My name is <b>bob</b>. What is yours?</textnode>
    <ul>
         <li>foo</li>
         <li>bar</li>
    </ul>
</div>

In other words, I would like to treat the <b> element as text and not as
an element. There is a finite set of tags I would like to be treated as
simple text. These are considered inline elements in HTML:
<b>, <i>, <em>, <strong>, <u>

An alternative, and better, solution would be to wrap all text
throughout the document in the textnode element, including the inline
elements mentioned above. The XML I will finally output from the
transformation of the XHTML requires all text, inline elements
included, to be wrapped in a special displaytext tag. With every piece
of text in a textnode, I could easily pass the document through another
template that says...

    <xsl:template match="textnode[normalize-space(.)]">
        <xsl:element name="displaytext">
            <xsl:apply-templates/>
        </xsl:element>
    </xsl:template>

This would make things much easier.
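
For reference, here is a minimal sketch of how the wrapping described above might be done in XSLT 2.0, using xsl:for-each-group with group-adjacent to collect each run of text nodes and inline elements into a single textnode. It is an illustration under assumptions, not a tested solution for real-world XHTML: the $inline variable name is made up for this sketch, elements are matched by local name so that the XHTML namespace does not matter, and comments or processing instructions in the middle of a sentence would split a run in two.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed list of element names to be treated as plain text. -->
  <xsl:variable name="inline" select="('b', 'i', 'em', 'strong', 'u')"/>

  <!-- Identity template: copies anything not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Any non-inline element that directly contains non-whitespace
       text or an inline element. -->
  <xsl:template match="*[not(local-name() = $inline)]
                        [text()[normalize-space()] or *[local-name() = $inline]]">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- The grouping key is true for text nodes and inline elements,
           false for every other child node. -->
      <xsl:for-each-group select="node()"
          group-adjacent="self::text() or self::*[local-name() = $inline]">
        <xsl:choose>
          <!-- A run of text and inline elements that is not just whitespace. -->
          <xsl:when test="current-grouping-key()
                          and normalize-space(string-join(current-group(), ''))">
            <textnode>
              <xsl:copy-of select="current-group()"/>
            </textnode>
          </xsl:when>
          <!-- Block-level children and whitespace-only runs are processed
               normally, so nested elements get the same treatment. -->
          <xsl:otherwise>
            <xsl:apply-templates select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the sample div, this should produce one textnode holding the whole sentence (with the b element inside it and the surrounding whitespace kept) and one textnode around the text of each li. If the intermediate textnode step is not needed, the literal textnode element in the sketch could be replaced with displaytext directly.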

Below are the xsl processor and xsl version. I am not tied to Saxon if
another processor could do the job, provided it can be used within Java
and ports across platforms (windows, unix, etc).

Processor: Saxon8B
XSL Version: 2.0

Thanks in advance for your help.
