

At 09:50 AM 8/6/2002 -0700, Dare Obasanjo wrote:
>Instead of tweaking tidy or Xerces-C why not just perform a simple search 
>and replace by hand or programmatically (*cough* Perl *cough*).

I wrote a hack in Java that takes care of the nasty '<![if' as well as a 
few other cases peculiar to Office HTML output:

http://simonstl.com/projects/O2KCleaner/

It's meant to be hooked up to the input on a SAX parser, and has done 
pretty well on the cases I've fed it, but I can't begin to promise that the 
format hasn't evolved in even more devious directions than the ones I've found.
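
As a rough illustration of the idea (not the O2KCleaner code itself), the sketch below strips the '<![if ...]>' and '<![endif]>' markers from a string before handing it to a SAX parser. The class name ConditionalStripper, the strip method, and the sample input are all made up for this example. It assumes the rest of the markup is already well-formed, which real Office output usually is not.

```java
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of O2KCleaner.
public class ConditionalStripper {
    // Matches "<![if ...]>" and "<![endif]>" markers, which are not legal XML.
    // Ordinary comments such as "<!--[if gte mso 9]>" are left alone because
    // they start with "<!--", not "<![".
    private static final Pattern MARKER =
        Pattern.compile("<!\\[(?:if[^\\]]*|endif)\\]>");

    // Returns the input with every marker removed; content between markers is kept.
    public static String strip(String html) {
        return MARKER.matcher(html).replaceAll("");
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample of the kind of list markup Office emits.
        String dirty = "<p><![if !supportLists]>1. <![endif]>First item</p>";
        String clean = strip(dirty);

        // Feed the cleaned text to a SAX parser; DefaultHandler ignores all events.
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(clean)), new DefaultHandler());

        System.out.println(clean);
    }
}
```

Working on a whole string keeps the sketch short. A filter meant to sit on a parser's input would more likely wrap a Reader and clean the stream as it passes through.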

Simon St.Laurent
"Every day in every way I'm getting better and better." - Emile Coue

