Re: Converting HTML to plain text
I can constrain HTML pages to be valid XML. So, the hard part is solved.
If tables weren't an issue, I think "lynx -dump file.html" would work for you.
To deal with tables, you could try converting to groff format and using
groff's "tbl" pre-processor to format your tables.
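
As a rough illustration of the tbl route, here is a minimal Python sketch that turns one table from a well-formed XHTML page into tbl input. The function name table_to_tbl, the "@" cell separator, and the all-left-aligned column format are my own assumptions, not anything from the original thread. It also assumes no nested tables, no "@" inside cell text, and no row whose first cell begins with a period.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip an XML namespace prefix such as '{http://www.w3.org/1999/xhtml}'."""
    return tag.rsplit("}", 1)[-1]


def table_to_tbl(table, sep="@"):
    """Render one <table> element as tbl input (.TS ... .TE).

    Hypothetical helper: every column is left-aligned ('l') and short
    rows are padded with empty cells so each row has the same width.
    """
    rows = []
    for tr in table.iter():
        if local(tr.tag) != "tr":
            continue
        # Collapse whitespace inside each cell so a cell stays on one line.
        cells = [" ".join("".join(c.itertext()).split())
                 for c in tr if local(c.tag) in ("th", "td")]
        if cells:
            rows.append(cells)
    if not rows:
        return ""
    width = max(len(r) for r in rows)
    lines = [".TS", f"tab({sep});", " ".join(["l"] * width) + "."]
    for r in rows:
        lines.append(sep.join(r + [""] * (width - len(r))))
    lines.append(".TE")
    return "\n".join(lines)
```

Writing the result to a file and running "groff -t -Tascii" on it should produce a plain-text rendering of the table; the rest of the page would still need its own conversion.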
Larry Kollar k o l l a r @ a l l t e l . n e t
"The hardest part of all this is the part that requires thinking."
-- Paul Tyson, on xml-doc