>> This is a question about how the search scenario will play out on the
>> web once XML becomes widely implemented

Some suggestions & predictions:

1. The "whole web" search services are not keeping pace with the growth of the web; they are having to index more selectively and less often. There is therefore increasing room for more specialised search services. There will certainly be some that concentrate on a particular domain (say sports results) and that get to understand the DTDs that are widespread in that domain. This may in turn act as an incentive to the standardisation of domain DTDs.

2. Search engines will probably start applying heuristics to the XML structure even if they don't know the semantics of the DTD. This comes naturally to software trying to extract information from raw text. For example, tags with recognised names such as <TITLE> may raise the weighting of the text contained therein; tags that contain small amounts of text may be ranked more highly than tags containing most of the document.

3. Some conventional tags such as <META> may emerge and be used in a wide range of DTDs if the search engines are known to apply special heuristics to them. Other conventional tags, e.g. for personal names or places, may also emerge.

4. The general public is only interested in doing simple searches. In more specialist communities, query languages that allow the tagging to be exploited will become available. Many search engines already have languages that support "field-sensitive" searching and I think these can largely be applied to XML without extension. Such queries only make sense within the context of a single DTD or a family of closely-related DTDs. The "navigational" query languages such as the XLL syntax or DSQL are too precise and too complex for free text searching.

5. XML may start to become a vehicle for a site to publish an abstract of itself. Search services, rather than indexing all the content of a site (which is becoming unviable) will start to index the published abstracts of sites, and having directed the enquirer towards a site, will then delegate the within-site searching to a search engine at the site itself.

========================================================
By the way, does anyone know of a search engine (I mean software, not a web service) that understands XML? I have been looking at writing an IFilter interface for Microsoft's Index Server and it's rather daunting, especially as MS will presumably produce one themselves within a year.
========================================================

Regards,
Mike Kay, ICL

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
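
To make suggestion 2 above a little more concrete, here is a minimal sketch of what DTD-agnostic weighting might look like. It is not taken from any existing search engine: the function name `weight_terms`, the boost table and the 40-character threshold are all assumptions chosen purely for illustration, and it only looks at the text immediately inside each element.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical boosts for tag names an indexer might recognise without a DTD.
TAG_BOOST = {"title": 3.0, "meta": 2.0}
# Assumed threshold (in characters) for "small amounts of text".
SHORT_TEXT_LIMIT = 40
SHORT_TEXT_BOOST = 1.5


def weight_terms(xml_text):
    """Return {term: score} using only tag names and text length as hints."""
    scores = defaultdict(float)
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Only the text directly inside the element, before any child element;
        # text that follows a child element (its "tail") is ignored here.
        text = (elem.text or "").strip()
        if not text:
            continue
        weight = TAG_BOOST.get(elem.tag.lower(), 1.0)
        if len(text) <= SHORT_TEXT_LIMIT:
            weight *= SHORT_TEXT_BOOST
        for term in text.lower().split():
            scores[term] += weight
    return dict(scores)


# Example document with invented tag names; no DTD is consulted.
doc = (
    "<REPORT><TITLE>Cup final</TITLE>"
    "<BODY>The cup final was played on a wet afternoon "
    "and ended level after extra time.</BODY></REPORT>"
)
scores = weight_terms(doc)
```

With these assumed weights, words in the short <TITLE> element end up scoring higher than words that occur only in the long <BODY> element, which is the kind of ranking the heuristic is after. A real indexer would of course need proper tokenisation and some way of tuning the weights.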



