Re: RSS feeds and disable-output-escaping="yes"
> It's likely that the HTML isn't well-formed XML, so you're going to have to
> extract it as a string, put it through the tidy utility, parse it, and get
> it back into the stylesheet in tree form before you can manipulate it at the
> node level.
>
> I would tend to do this as a non-XSLT stage in a processing pipeline; you
> could also do it by calling out to an extension function.
>

Of course Michael is probably still using XSLT1. Some of us have moved up to
XSLT2 (there's a nice implementation called saxon8...), in which case you can
handle a fair amount of "non well formed html as a string" just using XSLT2
functions.

e.g.

h.xml:

    <greeting><![CDATA[<P>Hello, <i>world!</P>]]></greeting>

h.xsl:

    <?xml version="1.0" encoding="iso-8859-1"?>
    <xsl:stylesheet version="2.0"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                    xmlns:d="data:,dpc"
                    exclude-result-prefixes="d">

    <xsl:import href="http://www.dcarlisle.demon.co.uk/htmlparse.xsl"/>

    <xsl:output method="html"/>

    <xsl:template match="/">
      <html>
        <head>
          <title>Today's greeting</title>
        </head>
        <body>
          <xsl:copy-of select="d:htmlparse(string(greeting[1]),'',true())/node()"/>
        </body>
      </html>
    </xsl:template>

    </xsl:stylesheet>

    $ saxon8 h.xml h.xsl
    <html>
       <head>
          <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
          <title>Today's greeting</title>
       </head>
       <body>
          <p>Hello, <i>world!</i></p><i></i></body>
    </html>

The <i></i> there is an artifact of its html "recovery" mode of re-opening
automatically closed elements (looks like I should improve that a bit one
day). You can turn that off by changing true() in the above call to false(),
and then you get

    $ saxon8 h.xml h.xsl
    <html>
       <head>
          <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
          <title>Today's greeting</title>
       </head>
       <body>
          <P>Hello, <i>world!</i></P>
       </body>
    </html>

so now the <i> element has been closed, but no lowercasing or other
html-specific transformations have been done, and <i> isn't re-opened.
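For reference, the second run differs from the first only in the third argument of the d:htmlparse() call. A sketch of the full stylesheet with that change, assuming the same htmlparse.xsl import, the same d prefix binding, and the same h.xml input as above (nothing else is altered):

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:d="data:,dpc"
                exclude-result-prefixes="d">

<!-- Same imported parser as in h.xsl above -->
<xsl:import href="http://www.dcarlisle.demon.co.uk/htmlparse.xsl"/>

<xsl:output method="html"/>

<xsl:template match="/">
  <html>
    <head>
      <title>Today's greeting</title>
    </head>
    <body>
      <!-- Third argument is false() instead of true(): unclosed elements
           are still closed, but the html "recovery" mode is off, so names
           are not lowercased and <i> is not re-opened -->
      <xsl:copy-of select="d:htmlparse(string(greeting[1]),'',false())/node()"/>
    </body>
  </html>
</xsl:template>

</xsl:stylesheet>
```

Whether false() is the better choice probably depends on what happens to the result afterwards: with true() the element names come out lowercased, which is likely easier to match on in later templates, at the cost of the occasional empty re-opened element.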
David