
Re: Can Searchbots Find Web Pages That Aren't Linked To?

  • From: "bryan rasmussen" <rasmussen.bryan@g...>
  • To: "Costello, Roger L." <costello@m...>
  • Date: Mon, 10 Mar 2008 08:43:48 +0100

Seconding, especially the point that some search engines (and many
viruses) probe for common filenames. This method could be extended
with dictionary-attack-style techniques to find URIs. (Given the
limited utility, I suppose that will not happen for at least another
5 years, but at some point the payback might make it worth doing.)
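
As a rough illustration of what such probing could look like, here is a
minimal sketch that only builds the list of URIs a prober might try. It
does not fetch anything. The word list and the function name
candidate_uris are made up for this example, and a real prober would
presumably use a far larger dictionary.

```python
from urllib.parse import urljoin

# Hypothetical word list, invented for this example.
COMMON_NAMES = ["index.html", "admin/", "backup.zip", "test.html", "robots.txt"]


def candidate_uris(base, names=COMMON_NAMES):
    """Return the URIs a prober might try under `base` (nothing is fetched)."""
    # urljoin replaces the last path segment unless the base ends in "/".
    if not base.endswith("/"):
        base += "/"
    return [urljoin(base, name) for name in names]


if __name__ == "__main__":
    for uri in candidate_uris("http://www.example.org/people"):
        print(uri)
```

The number of candidates grows with the size of the dictionary times the
number of directories tried, which is why the payback is poor today.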

The question is "Can they find things that are not linked to?" The
answer to that question is yes.

If the question were "Will they find things that are not linked to?",
the answer MIGHT be no.

Cheers,
Bryan Rasmussen

On Mon, Mar 10, 2008 at 12:58 AM, Liam Quin <liam@w...> wrote:
> On Sun, Mar 09, 2008 at 07:45:03AM -0400, Costello, Roger L. wrote:
>  > Hi Folks,
>  >
>  > I was interested in knowing if searchbots can find web pages that
>  > aren't linked to.
>  >
>  > So, I conducted a simple experiment:
>  >
>  > http://www.xfront.com/can-searchbots-find-unlinked-web-pages/index.html
>
>  The conclusion doesn't entirely follow.
>
>  First, as others have said, check your server logs.
>
>  Second, there are multiple discovery methods that Web search
>  engines can use.  Following links is one of them and of course
>  the best known, but others include
>
>  * if there are ads on the page, the URL becomes known when the
>   adverts are shown
>
>  * if there are links on the "hidden" page to Web pages on other servers,
>   and you follow them, most Web browsers send the linked-to server
>   the URI from which the link was followed, as part of the HTTP
>   header (it's called the referrer, the HTTP Referer [sic] header).
>
>  * if you mention the URI in email it can get found :-)
>
>  * if the files are in a directory that can be listed, search
>   engines will find them -- e.g. if you make
>       http://www.example.org/people/friendly/david.html
>   public, the search engines might well look for
>       http://www.example.org/people/friendly/
>   and
>       http://www.example.org/people/
>   so if any of those gives a listing of directory contents, the
>   search engines will explore.
>
>  * some search engines (and many viruses) probe for common filenames.
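
The directory-listing point in the list above can be sketched as
follows: given one public URL, derive the parent directory URLs a
crawler might also request. This is only an assumption about how a
crawler could do it, not a description of any particular search engine.
The function name parent_directories is invented for the example, and
the site root is left out to match the two URLs Liam lists.

```python
from urllib.parse import urlsplit, urlunsplit


def parent_directories(url):
    """List the ancestor directory URLs of `url`, nearest first.

    The site root ("/") is deliberately excluded.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    parents = []
    # Drop the last segment, then keep shortening the path.
    for depth in range(len(segments) - 1, 0, -1):
        path = "/" + "/".join(segments[:depth]) + "/"
        parents.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return parents


if __name__ == "__main__":
    for parent in parent_directories(
        "http://www.example.org/people/friendly/david.html"
    ):
        print(parent)
```

If any of the resulting URLs returns a directory listing, every file in
that directory is effectively linked to.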
>
>  Without such hints, though, neither the search engines nor anyone
>  else will find the file, regardless of format.
>
>  Liam
>
>  --
>  Liam Quin, W3C XML Activity Lead, http://www.w3.org/People/Quin/
>  http://www.holoweb.net/~liam/ * http://www.fromoldbooks.org/
>
>
>
>  _______________________________________________________________________
>
>  XML-DEV is a publicly archived, unmoderated list hosted by OASIS
>  to support XML implementation and development. To minimize
>  spam in the archives, you must subscribe before posting.

