RE: resolve html entities
I would suggest parsing the HTML using John Cowan's TagSoup parser. This looks to the XSLT processor just like an XML parser, so you can probably integrate it directly - depending on the XSLT processor that you are using.

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: Maximilian Gärber [mailto:max@xxxxxxxxxx]
> Sent: 31 October 2005 08:40
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: resolve html entities
>
> Hi,
>
> I know this is a common question but I could not find a specific answer to this:
>
> I am exporting texts from a database that contains html markup. Now I need to transform the html to something usable in a DTP application.
>
> The tags are not the problem because I am only allowing a subset of html but the html entities (german umlauts, special characters) would need to be transformed to plain Unicode (UTF-8) characters.
>
> What is the best way to achieve this?
>
> Thanks,
>
> Max Gaerber
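
For the narrow case described in the question (a known subset of tags, where only the entities need resolving), the entity step on its own can be sketched without an HTML parser. This is not TagSoup and not the approach suggested in the reply above; it is a minimal Python illustration using the standard library's `html.unescape`. The function name `resolve_entities` and the sample string are made up for the example.

```python
import html


def resolve_entities(text: str) -> str:
    """Replace named and numeric HTML character references with Unicode characters.

    Hypothetical helper for illustration; it only wraps html.unescape.
    """
    return html.unescape(text)


# Sample database text (invented for this sketch).
sample = "<p>K&auml;se und Br&ouml;tchen, Gr&#xFC;&szlig;e</p>"

resolved = resolve_entities(sample)

# Encode explicitly as UTF-8 when handing the text on to another tool.
utf8_bytes = resolved.encode("utf-8")
print(resolved)
```

Note that `html.unescape` also turns `&amp;` and `&lt;` into literal `&` and `<`, so the result may no longer be well-formed if it is later fed to an XML parser or an XSLT processor. In that situation a parser-based route such as the one suggested above is probably the safer choice.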









