RE: When Searching With Google
None of which the average user understands. The question is what they would pay, in learning curve or subscription costs, for a search engine that behaves exactly as they think it should. The model by which they choose terms and the model that produces the results are at variance; two systems are contending for the same resource. The engine makes its best guess, and then the user starts searching in the results. OK. The human is smart enough, one assumes, to recognize what they are looking for if they find it in the results. On the other hand, ascribing importance to the order of the results, the Google numbers, or the negative space (results not returned) is at best a superstitious endeavor as long as the model they used to pick the initial terms and the model by which those terms are used to select results are not the same. Multiple systems contending for the same resource is a working definition of non-linearity, or unpredictable correlation. This is the well-known problem of the mental ontology contending with the search ontology.

Then there is the further problem of source vetting. Are authors doing high-quality, credentialed work? Note that Michael Kay did not write that first bit below. I did. You removed my name and left Michael's. Now what does Google do with that? Possibly nothing, but a human might, and the human is likely to get it wrong unless they follow the thread back to pick up the source. Now we have not only the mystery of Google's algorithms, but also the vagaries of human authoring habits.

That is why credentialed sources would be of value as part of a search filter. Let's say I am a university professor and I want my students to use the web to do research. How should I interpret their results if their sources are uncredentialed? The simple interface can lead to amplified error. The complex interface can lead to high costs and reduce the scale of use. But is it better to trade scale for reliable results?
len

Also, I heard recently that Google is making its search results adaptive to the user, using some heuristics, probably based on domain or something similar. In short, I heard that if I search for the keywords "w1 w2 ..." and someone else searches for the same set of keywords, Google might give differently ranked results; in other words, the user perceives the ranking as non-deterministic. I am not sure whether that is actually true. Can someone confirm this?

Google uses a lot of proprietary heuristics for fine-tuning the ranking of search results, on top of techniques such as tf-idf (which gives greater weight to a term that occurs infrequently in the corpus) that are well known in the literature.

Anyway, best,
murali.

On Mon, 8 Dec 2003, Michael Kay wrote:

> > That is why I wondered if it picked up on the topic
> > word or phrase. That is likely what they are after.
> > The other words are qualifiers, at least, that is
> > how I use it. I was questioning the Google strategy
> > because I realized I have a mental model of how it
> > works, and that is how I select and enter search
> > terms. It is probably not the right mental model
> > but the interface doesn't make it clear, and as a
> > result, its filtering strategy is opaque. The user
> > does the best they can.
>
> Most modern search engines give greater weight to a term the more
> infrequent it is in the corpus. Most also weight terms according to
> where and how often they appear in the source document, and some also
> recognize when adjacent words in the query constitute a noun phrase.
> What google does is anyone's guess.
>
> Michael Kay

-----------------------------------------------------------------
The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
initiative of OASIS <http://www.oasis-open.org>

The list archives are at http://lists.xml.org/archives/xml-dev/

To subscribe or unsubscribe from this list use the subscription
manager: <http://lists.xml.org/ob/adm.pl>
-----------------------------------------------------------------
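
For readers who want to see the term weighting mentioned in this thread in concrete form, here is a minimal sketch of textbook tf-idf scoring. It is only an illustration of the general idea that rarer terms count for more; it is not a description of what Google does, which the thread itself calls anyone's guess. The function name `tf_idf_scores`, the whitespace tokenization, and the `log(N / df)` form of the inverse document frequency are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def tf_idf_scores(query_terms, documents):
    """Score each document against the query with a plain tf-idf sum.

    Hypothetical helper for illustration: tokens are lowercased words
    split on whitespace, and idf is log(N / df).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in query_terms:
            term = term.lower()
            if df[term] == 0:
                continue  # term is absent from the corpus, contributes nothing
            # Term frequency: share of this document's tokens that are the term.
            tf = counts[term] / len(tokens) if tokens else 0.0
            # Inverse document frequency: rarer terms get a larger weight.
            idf = math.log(n_docs / df[term])
            score += tf * idf
        scores.append(score)
    return scores


docs = [
    "xml schema validation tools",
    "xml search engine ranking",
    "xml parsing in java",
]
print(tf_idf_scores(["xml", "ranking"], docs))
```

In this toy corpus "xml" occurs in every document, so its idf is log(1) = 0 and it does not affect the ordering at all; only the rarer word "ranking" separates the documents. That matches the point made above about qualifiers versus topic words, although a real engine would combine this with many other signals.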








