Re: Something altogether different?
One disadvantage of term-based weighting, or the vector space model, is the well-known example cited in Google's original paper (or rather, sales pitch?): a document containing only the words "Bill Clinton [expletive deleted]" was considered more important for the query "Bill Clinton" than the actual White House page (when Clinton was president).

I believe we can use the vector space model only when the document collection is "homogeneous" in some manner, with repetitive words and so on.

Also note that with the vector space model you have to compute the rank of documents in real time, given a query. For other metrics such as PageRank, the rank of documents can be pre-computed, and we can use better algorithms based on this property.

best,
murali.
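To make the two points above concrete, here is a rough Python sketch. Everything in it is an assumption made for illustration: the two toy documents, the four-page link graph, the page names, and the damping factor of 0.85 (a commonly quoted value). It uses plain term-frequency vectors with cosine similarity and a basic power iteration, so it is not how any real search engine is implemented.

```python
import math
from collections import Counter


def tf_vector(text):
    """Raw term-frequency vector for a whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pagerank(links, damping=0.85, iterations=100):
    """Power iteration; assumes every page has at least one outlink."""
    n = len(links)
    rank = {page: 1.0 / n for page in links}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in links}
        for page, outlinks in links.items():
            share = damping * rank[page] / len(outlinks)
            for target in outlinks:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical documents: a three-word page and a longer official page.
docs = {
    "short_page": "bill clinton rant",
    "official_page": ("welcome to the white house the office of "
                      "president bill clinton and the first family"),
}

# Query-time step: similarity must be computed once the query is known.
query_vec = tf_vector("bill clinton")
cosine_scores = {name: cosine(query_vec, tf_vector(text))
                 for name, text in docs.items()}
ranked_by_cosine = sorted(cosine_scores, key=cosine_scores.get, reverse=True)

# Offline step: link-based scores do not depend on any query.
# Hypothetical link graph in which most pages point at the official page.
links = {
    "short_page": ["official_page"],
    "official_page": ["news"],
    "news": ["official_page"],
    "blog": ["official_page", "short_page"],
}
pagerank_scores = pagerank(links)

print(ranked_by_cosine)
print(sorted(pagerank_scores, key=pagerank_scores.get, reverse=True))
```

In this toy setup the three-word page wins on cosine similarity because nearly all of its terms match the query, while the link-based score, which can be computed before any query arrives, favours the page that the others point to.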