XML-DEV Mailing List Archive

Re: Something altogether different?


music predictor
> I believe we can use vector-space model only when the document collection
> is "homogeneous" in some manner.. and has repetitive words etc.
>
> Also note -- vector space model, you have to obtain rank of documents in
> real-time given a query.

Cohen's '99 WHIRL paper discusses the ranking heuristics, the storing of
similarities instead of computing them in real-time, and the use of views to
persist information about the highest-scoring answers:

"Fortunately, in most cases, it is not necessary to compute all answers to a
query, as only the high-scoring answers will be of interest. WHIRL's inference
algorithms are thus designed to find a few good answers to a query, without
generating all possible answers. The operations most commonly performed by a
user (or program) interacting with WHIRL are to define and r-materialize views.
To r-materialize a view, WHIRL finds the "r" highest-scoring ground atoms "a"
associated with a view, and stores those facts in the EDB (extensional database)
for later use."
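
To make the "store the top r instead of recomputing" idea concrete, here is a minimal sketch in Python. It is not WHIRL's inference algorithm: it scores every document by brute force, which is exactly what WHIRL is designed to avoid, and it uses raw term counts where a real system would use TF-IDF weights. The names `tf_vector`, `cosine`, `r_materialize` and the toy documents are my own, purely for illustration.

```python
import heapq
import math
from collections import Counter


def tf_vector(text):
    """Term-frequency vector (a simple stand-in for TF-IDF weighting)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def r_materialize(query, documents, r):
    """Return the r highest-scoring (score, doc_id) pairs, best first."""
    q = tf_vector(query)
    scored = ((cosine(q, tf_vector(text)), doc_id)
              for doc_id, text in documents.items())
    # nlargest keeps only r candidates in memory while scanning.
    return heapq.nlargest(r, scored)


# Hypothetical toy collection; the result plays the role of the stored view.
documents = {
    "d1": "jazz piano trio live recording",
    "d2": "xml schema validation tools",
    "d3": "jazz guitar solo recording",
}
stored_view = r_materialize("jazz recording", documents, r=2)
```

Later queries against the view can read `stored_view` directly rather than rescoring the collection, which is the point of persisting only the highest-scoring answers.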

> For other metrics such as say pagerank, rank of documents can be
> pre-computed, and we can use better algorithms based on this property.

In the "Recommending Music by Crawling The Web" paper, Cohen and Fan researched
music preferences by spidering the web and using four different scoring
algorithms: popularity, K-nearest neighbor, weighted majority and an extended
direct Bayesian prediction.

In a 1998 paper, Cohen, Schapire and Singer discussed the use of a preference
function when determining ranking (excerpt below):
http://citeseer.ist.psu.edu/cache/papers/cs/17244/http:zSzzSzdnkweb.denken.or.jpzSzboostingzSzpaperszSzCohSchSin98.pdf/cohen98learning.pdf

Learning to Order Things
There are many applications in which it is desirable to order rather than
classify instances. Here we consider the problem of learning how to order, given
feedback in the form of preference judgments, i.e., statements to the effect
that one instance should be ranked ahead of another. We outline a two-stage
approach in which one first learns by conventional means a preference function,
of the form PREF
... Nevertheless, we describe a simple greedy algorithm that is guaranteed to
find a good approximation. We then discuss an on-line learning algorithm, based
on the "Hedge" algorithm, for finding a good linear combination of ranking
"experts."
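
The second stage the abstract describes, turning a pairwise preference function into an ordering, can be sketched as a greedy procedure. This is my reading of the idea, not code from the paper: I assume `pref(u, v)` returns a value in [0, 1], where values near 1 mean `u` should rank ahead of `v`, and the ratings used to build the toy preference function are made up.

```python
def greedy_order(items, pref):
    """Order items greedily from a pairwise preference function.

    pref(u, v) is assumed to lie in [0, 1]; values near 1 mean u should
    be ranked ahead of v.
    """
    remaining = list(items)
    # Potential of v: preference for v over the others, minus preference
    # for the others over v.
    potential = {
        v: sum(pref(v, u) - pref(u, v) for u in remaining if u != v)
        for v in remaining
    }
    ordering = []
    while remaining:
        # Pick the item that is most strongly preferred over what is left.
        best = max(remaining, key=lambda v: potential[v])
        ordering.append(best)
        remaining.remove(best)
        # Drop the removed item's contribution from every other potential.
        for v in remaining:
            potential[v] += pref(best, v) - pref(v, best)
    return ordering


# Hypothetical preference function built from made-up ratings.
ratings = {"a": 3, "b": 5, "c": 1}


def pref(u, v):
    if ratings[u] == ratings[v]:
        return 0.5
    return 1.0 if ratings[u] > ratings[v] else 0.0


ranked = greedy_order(ratings, pref)
```

With a consistent preference function like this one the greedy pass simply recovers the sorted order; the interesting case is when the learned preferences contain cycles and the procedure can only approximate the best ordering.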



