Re: Selective escaping of special characters
[Kyrre Wathne]
> My apologies if this question has been asked before, I haven't found posts
> that address this exact issue.
>
> My problem is that I want to transform junk HTML generated by Microsoft
> Word. This contains markup, of course, so my first instinct was to use
> disable-output-escaping. However, this also disables escaping of other
> special characters, like the special dash character –. These are then
> outputted in a format my browser (Internet Explorer) doesn't understand (I
> use "ISO-8859-1" as encoding in output).
>

Not exactly what you asked for, but HTML-Tidy has a setting that causes it to remove all the Microsoft junk from Word2000 output. There are Java and C versions, with various wrappers including Python. One fast preprocessing pass with Tidy will do a really nice job of getting rid of all that noise, much easier than trying to get a stylesheet working.

Cheers,

Tom P

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
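
For readers who land here with the same encoding problem, the effect the question asks for (markup passes through untouched, but characters that ISO-8859-1 cannot represent become numeric character references) can be sketched outside the stylesheet. This is not the Tidy approach suggested above, only a minimal illustration using Python's standard `xmlcharrefreplace` error handler; the function name `escape_outside_latin1` and the sample string are made up for the example.

```python
def escape_outside_latin1(markup: str) -> bytes:
    """Encode markup as ISO-8859-1, replacing unencodable characters
    with numeric character references such as &#8211;."""
    # "xmlcharrefreplace" only touches characters the codec cannot
    # represent, so tags and Latin-1 letters pass through unchanged.
    return markup.encode("iso-8859-1", errors="xmlcharrefreplace")


# Hypothetical sample: U+00E9 fits in ISO-8859-1, U+2013 (en dash) does not.
sample = "<p>caf\u00e9 \u2013 bar</p>"
print(escape_outside_latin1(sample))
```

The resulting bytes can be written straight to a file served as ISO-8859-1; the browser then sees the dash as a character reference rather than as raw bytes in an encoding it does not expect.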








