Re: xml search engine?
At 11:43 AM 03/29/2000 +0200, Reinout van Rees wrote:
>There is a problem I see for xml search engines. How are they going to
>cope with all the various DTD's? They ARE going to cope, but what will
>be the result? Will we have lots of small search engines searching for
>information in all reinforced_concrete_supplier.dtd xml files it can
>find and another for all medicine.dtd info? Will there be a few
>standard elements in most DTD's to comply to some emerging behaviour
>of all search engines? There are so many ways this could work out. Any
>opinions?
>

I suspect you will see both small, topic/discipline-specific search engines and a variety of efforts to build more encompassing systems that allow cross-domain searching either by explicit agreements on mappings between different DTD vocabularies or by use of probabilistic techniques to try to translate users' queries into appropriate terminology for use in searching against different vocabularies (Profs. Buckland and Larson of my program have been doing work on this form of bridging between entry vocabularies; see http://sims.berkeley.edu/research/metadata if you want more details).

Which is to say, there are times when I would want a search engine that only searched medicine.dtd documents and provided unique features that exploited knowledge of the DTD to assist me searching, and other times when I'd be willing to settle for somewhat less sophistication in order to achieve a wider search. One of the first things I learned in library school was that it was a mistake to assume that there was one best type of access/search mechanism, and that in fact it was usually better to provide several different ways of getting at data so that users can employ a search mechanism more tailored to their current needs. I think the future is likely to hold a mix of small, custom-tailored search engines and larger, more generic systems.
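To make the "explicit agreements on mappings" option a little more concrete, here is a rough Python sketch of how a cross-domain engine might translate one shared search concept into the element name each DTD uses. Everything in it is an assumption made for illustration: the concept names ("author", "place"), the element names, the sample documents, and the `search` helper are all hypothetical, and only the two DTD file names come from the question above. It is not a description of any existing engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical agreed mapping: shared search concept -> element name per DTD.
# The element names below are invented; real DTDs would define their own.
CONCEPT_MAP = {
    "medicine.dtd": {"author": "investigator", "place": "clinic_location"},
    "reinforced_concrete_supplier.dtd": {"author": "company", "place": "plant_site"},
}


def search(documents, concept, term):
    """Return the names of documents whose element mapped to `concept`
    contains `term` (case-insensitive substring match).

    `documents` is a list of (name, dtd, xml_text) tuples.
    """
    hits = []
    needle = term.lower()
    for name, dtd, xml_text in documents:
        element_name = CONCEPT_MAP.get(dtd, {}).get(concept)
        if element_name is None:
            continue  # no agreed mapping for this concept in this vocabulary
        root = ET.fromstring(xml_text)
        for element in root.iter(element_name):
            if needle in (element.text or "").lower():
                hits.append(name)
                break  # one match per document is enough
    return hits


# Two made-up documents, one per vocabulary.
DOCS = [
    ("trial.xml", "medicine.dtd",
     "<study><investigator>A. Jansen</investigator>"
     "<clinic_location>Delft</clinic_location></study>"),
    ("supplier.xml", "reinforced_concrete_supplier.dtd",
     "<supplier><company>Delft Beton</company>"
     "<plant_site>Rotterdam</plant_site></supplier>"),
]

if __name__ == "__main__":
    print(search(DOCS, "place", "delft"))
    print(search(DOCS, "author", "delft"))
```

The point of the sketch is only that the user asks for a concept, not an element name, so the same query can run against both vocabularies; a DTD-specific engine could skip the mapping table and exploit the element structure directly.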
In terms of allowing the more global search engines to assist users in exploiting XML's potential, I'm hoping that HTTP will eventually be abandoned in favor of a protocol which provides better support for information retrieval. I don't expect the world to make a mad rush to adopt Z39.50, but its concept of use attributes, which allow you to specify, for example, that you specifically want to search for a corporate author or a geographic name associated with a resource, provides a useful common language for expressing searches that can then be mapped to the elements within a particular DTD by those making the information available. It would be nice if search engines which were indexing a data repository could start by asking the repository for information on how the XML elements used in documents within the repository map to some standard set of search attributes. With any luck, as XML becomes increasingly available over the WWW, we'll see some movement towards adopting communication protocols which allow us to exploit its potential more fully.

Jerome McDonough -- jmcdonou@l...
Library Systems Office, 386 Doe, U.C. Berkeley
Berkeley, CA 94720-6000 (510) 642-5168
"Well, it looks easy enough...."
SGNORMPF!!! -- From the Famous Last Words file

This is xml-dev, the mailing list for XML developers.
To unsubscribe, mailto:majordomo@x...&BODY=unsubscribe%20xml-dev
List archives are available at http://xml.org/archives/xml-dev/