I received this announcement via e-mail yesterday. It may (or may not :)
be of interest to xml-dev and xml-l subscribers. Contact information is
at the foot of the announcement.

[Disclaimer: I have no affiliation with the W4F product development
group. My correspondent, previously unknown to me, just happened on my
website. Apologies for the cross-posting to subscribers of both lists.]

>----- Looking at the Web through XML glasses, using W4F -----
>
>The World Wide Web Wrapper Factory (W4F) is a Java toolkit to
>generate wrappers for HTML data sources.
>
>Version 1.03 offers a built-in declarative mapping to XML.
>Using W4F it is now possible to easily specify the translation
>of HTML pages into XML documents. Moreover, the specification
>gives for free the DTD.
>
>W4F consists of a retrieval language to identify Web sources, a
>declarative extraction language (HEL: HTML Extraction Language)
>to express robust extraction rules and a mapping interface to
>export the extracted information into some user-defined
>data-structures (text, Java objects, XML, etc.).
>The wrappers are generated as Java classes that can be used as is
>or integrated into higher-level applications.
>
>Version 1.03 provides some improved visual support to make the
>creation of wrappers easier and faster. In particular, the
>extraction of HTML can be done via a wysiwyg interface.
>
>The W4F toolkit comes as a Java package and can be downloaded from
>the W4F web site. It is free for non-commercial use.
>Various examples of running wrappers are also available for download
>from the web site.
>
>Web site:
>http://db.cis.upenn.edu/W4F
>
>Contacts:
>Arnaud Sahuguet
>Database Research Group, Univ. of Pennsylvania, PA, USA
>sahuguet@g...
>http://www.cis.upenn.edu/~sahuguet
>
>Fabien Azavant
>École Nationale Supérieure des Télécommunications, Paris, France
>Fabien.Azavant@e...
>http://www.stud.enst.fr/~azavant

==========================================================
John E. Simpson          | The secret of eternal youth
simpson@p...             | is arrested development.
http://www.flixml.org    | -- Alice Roosevelt Longworth

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)



