
Re: Is google a conceptual graph engine?



My comments:

On Mon, 6 Oct 2003, Didier PH Martin wrote:

> Now if we are using xlink, some additional information can be added
> <a xlink:type="simple" xlink:href="index.html" xlink:role="partOf">XML
> Guide</a>
> Since the source is currentDoc and the destination index.html, then the
> conceptual graph for this statement is:
> [currentDoc]->(partOf)->[index.html] --- currentDoc is part of index.html
> Which could make sense if we consider that the first document represents the
> cover page or that it is the domain's table of contents (most of the time,
> the document associated with the domain is also a table of contents linking to
> the other documents). Based on this premise, documents are organized as a
> hierarchy and the document associated with the domain is the root.
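
The mapping from the quoted link to the graph statement can be sketched as follows. This is only an illustration: the function names link_to_triple and format_graph are hypothetical, and the xmlns:xlink declaration is added here so that the fragment parses on its own (the quoted snippet assumes it is declared elsewhere in the document).

```python
import xml.etree.ElementTree as ET

# Namespace URI defined by the XLink specification.
XLINK_NS = "http://www.w3.org/1999/xlink"


def link_to_triple(source_doc, anchor_xml):
    """Return (source, role, target) for a simple XLink anchor.

    source_doc is a label for the document containing the link;
    it is supplied by the caller because the anchor does not carry it.
    """
    elem = ET.fromstring(anchor_xml)
    href = elem.get(f"{{{XLINK_NS}}}href")
    role = elem.get(f"{{{XLINK_NS}}}role")
    return (source_doc, role, href)


def format_graph(triple):
    """Render a triple in the [source]->(relation)->[target] notation."""
    source, role, target = triple
    return f"[{source}]->({role})->[{target}]"


# The anchor from the quoted message, with the namespace declared inline.
anchor = (
    '<a xmlns:xlink="http://www.w3.org/1999/xlink" '
    'xlink:type="simple" xlink:href="index.html" xlink:role="partOf">'
    "XML Guide</a>"
)

triple = link_to_triple("currentDoc", anchor)
print(format_graph(triple))  # [currentDoc]->(partOf)->[index.html]
```

The role string is carried through untouched, so whatever meaning it has depends entirely on how the consumer interprets it.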

Note also that the role needs to use plain English words; otherwise whatever
little semantics we have is lost. Your partOf will probably be treated
by a search engine as a single special keyword.

For example, consider a document that Google indexes as:

I am partOf xml-dev

If you search for "part of xml-dev", the above document will not be returned.
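
A minimal sketch of why the phrase fails to match, assuming a simple indexer that lowercases text and splits on non-alphanumeric characters. This is not a description of how Google actually tokenizes; tokenize and contains_phrase are hypothetical helpers written for this example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_phrase(doc_tokens, phrase_tokens):
    """Return True if phrase_tokens occurs as a contiguous run in doc_tokens."""
    n = len(phrase_tokens)
    return any(
        doc_tokens[i:i + n] == phrase_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


doc = tokenize("I am partOf xml-dev")   # ['i', 'am', 'partof', 'xml', 'dev']
query = tokenize("part of xml-dev")     # ['part', 'of', 'xml', 'dev']

print(contains_phrase(doc, query))                              # False
print(contains_phrase(tokenize("I am part of xml-dev"), query))  # True
```

Under this assumption "partOf" becomes the single token "partof", which matches neither "part" nor "of".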

Google also uses "anchor context" to determine the importance of pages.

> Now the problem for any classification agent is that, in order to satisfy
> mercantile appetites (or simply to pay the monthly bills), some people who
> know that agents use the role to establish a relationship between two
> documents will play with the system in order to get a good ranking. Some
> would reply: let's get rid of these search engines and create
> autonomous agents that travel the web to collect relevant documents. No
> problem, but how long will it take such an agent to cover enough of the web
> to collect significant documents? What guarantees do you have that all links
> will honestly report their relationship with other documents to your agent,
> rather than misreport it (by will or simply by error)? Your agent's travel
> agenda may be dependent on these relationship types....
>
> Hmm, definitely, the semantic web is not a simple affair... As some of
> our social problems are rooted in our nature or prehistoric times, some
> problems which could potentially be a plague to the semantic web are rooted
> in today's web.

Cheating on the web to gain false importance is one of the things that
Google cleverly guards against. This is how Google assigns importance to
pages, using the PageRank algorithm:

Every web page starts with the same rank, say 1.

Google then does the following for several iterations. In every iteration,
the rank of every page is recomputed from the ranks of the pages that point
to it: each linking page passes on its rank, divided among its outgoing
links. For example, if your page is pointed to by Yahoo, then you have a
high rank, but if your page is pointed to only by, say, my home page, you
will have a lower rank.

They repeat this until the ranks converge (I think experiments as well as
theoretical results show that the number of iterations needed is small,
< 10 if I remember correctly, irrespective of the starting rank of every
page).
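
A minimal sketch of the iteration, assuming the commonly published PageRank formulation, in which each page divides its rank among its outgoing links and a damping factor (0.85 here) is applied. It is not Google's production system. The tiny link graph and the page names ("portal", "homepage", "yourpage", "other") are made up for illustration, and the fixed count of 50 iterations is an arbitrary choice for this toy graph.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively compute ranks for a dict mapping page -> list of linked pages.

    Every link target must also appear as a key in `links`.
    Ranks are normalised so that they sum to 1.
    """
    pages = list(links)
    n = len(pages)
    # Every page starts with the same rank.
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Baseline share that every page receives regardless of links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page passes its rank on, split evenly over its outlinks.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical graph: "portal" is linked from everywhere, "homepage" from nowhere.
links = {
    "portal": ["yourpage", "other"],
    "homepage": ["portal"],
    "yourpage": ["portal"],
    "other": ["portal"],
}

ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:10s} {score:.3f}")
```

In this toy graph "yourpage" ends up well above "homepage" because it is linked from the highly ranked "portal", while "homepage" has no incoming links at all. Raising your own rank therefore requires links from pages that already rank well.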

The main point is that it is difficult to get your page ranked highly
through tricks.

Note: has anyone seen a case where even PageRank can be fooled? I do
not remember having seen one.

best regards,
murali.

