HTML scraping
If you want to scrape everything (transforming the whole HTML page into XML), then Tidy is the right way to go, as mentioned in a previous posting.

If you want to extract only SOME of the HTML information and map it to XML, then you should look at W4F (http://db.cis.upenn.edu/W4F/). There are a couple of on-line examples that show how to build XML gateways that transform HTML into XML on the fly. The resulting XML can then be used by other applications.

http://db.cis.upenn.edu/W4F/Examples/XML-Gateway/

There is also an interesting related article in JavaWorld:

http://www.javaworld.com/javaworld/jw-03-2001/jw-0316-webdb.html

Regards,
Arnaud
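P.S. To make the "extract only SOME information" case concrete, here is a rough sketch in Python. It uses neither Tidy nor W4F, only the standard library (html.parser and xml.etree.ElementTree). The names LinkExtractor and html_links_to_xml, and the <links>/<link> output vocabulary, are made up for this illustration. It pulls the hyperlinks out of a page and maps them to a small XML document:

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class LinkExtractor(HTMLParser):
    """Collect (href, text) pairs for every <a href="..."> in the page.

    Hypothetical helper for illustration; not part of Tidy or W4F.
    """

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def html_links_to_xml(html):
    """Map the links found in an HTML string to a <links> XML document."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    root = ET.Element("links")
    for href, text in parser.links:
        link = ET.SubElement(root, "link", href=href)
        link.text = text
    return ET.tostring(root, encoding="unicode")
```

For example, html_links_to_xml('<p>See <a href="b.html">second</a></p>') yields a <links> element containing one <link href="b.html"> child. The parser tolerates sloppy markup such as unclosed <br> or <p> tags, which is the main reason to prefer it over an XML parser when the input is real-world HTML.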