RE: Handling Non Well conformed HTML content

Subject: RE: Handling Non Well conformed HTML content
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Tue, 3 Oct 2006 14:12:02 +0100
> I have typical issue in handling HTML content in XML document 
> of the below structure and i want to replace the HTML 
> template with the respective node element text.
> HTML is not well formed. 

Before you can process the HTML, you will have to turn it into well-formed
XML. You can do this using the JTidy utility.

> For that matter we are doing base64
> encode of the html content.

You'll have to find a Base64 decoder. Details will depend on your processing
environment, e.g. whether it's Java, Microsoft, or whatever.

However, I can't relate either of those points to the example you show
below.

> Please provide any resolution for the same.
> The replacement content might be in any part of the document.
> Any suggestions are welcome.
> 
> Input content
> <?xml version="1.0" encoding="UTF-8"?>
> <broadcast>
>   <content_vars>
>    <content name="subject"><html>Hello [[BUYERS_NAME]]</html></content><!--encoded-->
>    <content name="text">REF Order [WEB_ORDER_NUMBER]</content><!--encoded-->
>   </content_vars>
> 
>     	<ORDER_FEED>
> <ORDER>
> <ORDER_HEADER>
> <BUYERS_NAME>Senthil</BUYERS_NAME>
> <WEB_ORDER_NUMBER>W12345</WEB_ORDER_NUMBER>
> </ORDER_HEADER>
> <!--Line Items-->
> </ORDER>
> </ORDER_FEED>
> </broadcast>
> 
> XSLT I tried for the same
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> 
> <xsl:output method="html" indent="yes" />
> 
> <xsl:template match="/broadcast">
>         <xsl:apply-templates select="content_vars/content" />
> 
> </xsl:template>
> 
> <xsl:template match="content">
> 
>      <xsl:variable name="temp1" select="translate(., '[]', '')" />
>      <xsl:variable name="temp2"
>          select="normalize-space(../following-sibling::*[contains($temp1, local-name())])" />
>      <xsl:variable name="temp3"
>          select="local-name(../following-sibling::*[contains($temp1, local-name())])" />
>      <xsl:value-of select="substring-before($temp1, $temp3)" />
>      <xsl:value-of select="$temp2" />
>      <xsl:value-of select="substring-after($temp1, $temp3)" />
> </xsl:template>
> 
> </xsl:stylesheet>
> 
> Expected output
> <html>
> Hello Senthil
> REF Order W12345
> </html>
> 
> And I am getting unexpected
> <html>
> Hello BUYERS_NAME
> REF Order WEB_ORDER_NUMBER
> </html>
> Let me know how do I tweak the code to work as desired.

I think it's more than a tweak. Your main mistake is using the
following-sibling axis rather than following (the BUYERS_NAME element is not
a sibling of the content_vars element). But also, your code seems generally
lacking in robustness. You're ignoring both the HTML tagging and the [[...]]
markers (or [...] depending which of the two examples we look at); you're
assuming that there will only be one insert in each element, and that its
name won't clash with any other textual content in the element. This all
seems pretty poor coding.
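
To illustrate those points, here is a minimal XSLT 1.0 sketch of one possible approach, not a tested solution for your real data. It scans the text for bracketed markers, so several inserts per element are handled, and it looks each name up on the following axis with an exact name comparison rather than contains(), so that ORDER is not picked up just because WEB_ORDER_NUMBER appears in the text. The template name "expand" and its "text" parameter are invented for this example. It assumes the content has already been decoded and tidied into well-formed XML (in particular, that WEB_ORDER_NUMBER has a proper closing tag), and that the data always follows the content elements in document order.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" indent="yes"/>

  <xsl:template match="/broadcast">
    <html>
      <xsl:for-each select="content_vars/content">
        <!-- string(.) flattens any markup inside the content element -->
        <xsl:call-template name="expand">
          <xsl:with-param name="text" select="string(.)"/>
        </xsl:call-template>
        <xsl:text>&#10;</xsl:text>
      </xsl:for-each>
    </html>
  </xsl:template>

  <!-- "expand" is a hypothetical helper: it copies $text, replacing each
       [NAME] or [[NAME]] marker with the value of the first following
       element whose name is exactly NAME. Unknown markers are left as-is. -->
  <xsl:template name="expand">
    <xsl:param name="text"/>
    <xsl:choose>
      <xsl:when test="contains($text, '[')">
        <!-- everything after the first '[' -->
        <xsl:variable name="rest" select="substring-after($text, '[')"/>
        <!-- candidate name; translate() drops the inner '[' of a '[[' marker -->
        <xsl:variable name="name"
            select="translate(substring-before($rest, ']'), '[', '')"/>
        <xsl:variable name="tail" select="substring-after($rest, ']')"/>
        <!-- true for the [[NAME]] form, where one more ']' must be skipped -->
        <xsl:variable name="double"
            select="starts-with($rest, '[') and starts-with($tail, ']')"/>
        <xsl:variable name="value"
            select="following::*[local-name() = $name][1]"/>
        <xsl:value-of select="substring-before($text, '[')"/>
        <xsl:choose>
          <xsl:when test="$value">
            <xsl:value-of select="normalize-space($value)"/>
            <!-- the boolean converts to 0 or 1, so this skips the second ']' -->
            <xsl:call-template name="expand">
              <xsl:with-param name="text" select="substring($tail, 1 + $double)"/>
            </xsl:call-template>
          </xsl:when>
          <xsl:otherwise>
            <!-- not a known marker: keep the '[' and carry on after it -->
            <xsl:text>[</xsl:text>
            <xsl:call-template name="expand">
              <xsl:with-param name="text" select="$rest"/>
            </xsl:call-template>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

On the sample input this should give "Hello Senthil" and "REF Order W12345" inside the html element, though the exact whitespace depends on the serializer. Because it works on the string value of each content element, any HTML tagging inside the content is dropped; to keep it, the same named template could be called from a template matching the text nodes while the elements are copied.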

Michael Kay
http://www.saxonica.com/
