RE: Identity
Lars Marius Garshol [mailto:larsga@i...] writes:

> | In another post on this thread, Lars Marius Garshol asked if the
> | following two URLs denote the same resource:
> |
> | <URL: http://www.stud.ifi.uio.no/~larsga/linker/XMLtools.html>
> | <URL: http://birk105.studby.uio.no/linker/XMLtools.html>
> |
> | My question is, does it matter? Is there a case where we need an
> | application to know or think that these two URLs are the same?
>
> Definitely! When people do a search for 'Free XML software' on Google
> I want them to get a result more or less like:
>
> <li><a
> href="http://www.stud.ifi.uio.no/~larsga/linker/XMLtools.html">
> Free XML software</a> (<a
> href="http://birk105.../">alternative</a>)
>
> and not to see these as two completely unrelated sites.

But in this case the search engine isn't treating them as "the same thing"; it is treating them as two distinct "things", which are in some way *related*. ("ThingB" is a mirror of "ThingA".) Having two "things" which are related is a much different kettle of fish than having two "things" and trying to figure out if in fact they are the same "thing". (If they were, in fact, "the same", then there would be no need to have a link to the second "thing".)

> | OTOH, are THESE two URLs the same:
> |
> | <URL: http://a.server.com/dir/page.asp>
> | <URL: http://a.server.com/dir/page.asp?param1=5&param2=6>
> |
> | This, in my [small] mind, is a much more difficult question to
> | answer, but again, is there a case where we need an application to
> | know or think that these refer to the same thing?
>
> Sure! Lots! Some examples:
>
> - a server log analyzer that provides a referral report should merge
>   references from these two

But to the web server itself, i.e. a.server.com, there really would never be such a "thing" as "page.asp?param1=5&param2=6"; there would only be a "page.asp", and anything else is just a parameter to the one "thing".
(This is strictly when talking about ASP; if we talk about CGI I would be in over my head, not having dealt with it, but I have a feeling that it would be similar: to the web server, there would only be one [executable?] which would be our "thing", and anything else would be parameters.)

OTOH, if we move our point of reference to an external computer somewhere, which I guess is where I've been talking from, if it is "merging" references from the two, then it is treating both as different "things". (If they're both the same "thing", then there's nothing to merge.)

> - a search engine should know whether they are the same, just as with
>   my example above

See the point I'm about to make below...

> - software that builds an offline copy of a web site should know
>   whether to make separate copies for these two URLs
>
> and so on...
>
> And, BTW, it's by no means obvious that those two URLs really refer to
> the same thing. I'm sure you'll agree that these two URLs refer to
> different resources, for example:
>
> <URL:
> http://www.80s.com/cgi-bin/valley.cgi?url=http%3A%2F%2F208.206.40.209%2Fmyfamily%2Froad.html>
> <URL:
> http://www.80s.com/cgi-bin/valley.cgi?url=http%3A%2F%2F207.200.30.120%2F%47over%6Eor%2F%42ush.html>
>
> --Lars M.

Right, but this is kind of my point. If two URLs (or URIs) are character-for-character identical, then they're the same thing. If they're different *in any way*, then perhaps they should be treated as different resources, or perhaps "different but related" resources. i.e.

<URL: http://a.server.com/dir/page.asp>
is the same as
<URL: http://a.server.com/dir/page.asp>
and is different from
<URL: http://a.server.com/dir/page2.asp>
and is different but related to
<URL: http://a.server.com/dir/page.asp?param1=5>

(I readily admit that this may be a gross over-simplification.)

(And I heartily wish that I could remember how this discussion got started, so that my examples could be more relevant. Did it start with namespaces?
Or Schemas, and their use of namespaces? Or something completely unrelated? Even the very first "Identity" email was in reference to ANOTHER thread, so I can't even trace it back...)

David Hunter
david.hunter@m...
MediaServ Information Architects
http://www.MediaServ.com

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
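
The three-way distinction drawn in the reply above (the same, different, different but related) can be illustrated with a minimal Python sketch. It follows the post in treating only character-for-character identical URLs as the same. The rule for "related" is an assumption made for this sketch, not something the post defines: two URLs count as related when they share scheme, host and path and differ only in the query string or fragment. The function name `classify` is hypothetical.

```python
from urllib.parse import urlsplit


def classify(url_a: str, url_b: str) -> str:
    """Classify a pair of URLs as 'same', 'related', or 'different'."""
    # Character-for-character identical: the same thing.
    if url_a == url_b:
        return "same"
    a, b = urlsplit(url_a), urlsplit(url_b)
    # Assumption for this sketch: same scheme, host and path, with only
    # the query string or fragment differing, is "different but related".
    if (a.scheme, a.netloc, a.path) == (b.scheme, b.netloc, b.path):
        return "related"
    return "different"


base = "http://a.server.com/dir/page.asp"
print(classify(base, "http://a.server.com/dir/page.asp"))           # same
print(classify(base, "http://a.server.com/dir/page2.asp"))          # different
print(classify(base, "http://a.server.com/dir/page.asp?param1=5"))  # related
```

A purely textual rule like this reports the two XMLtools.html URLs from the start of the thread as different, since the mirror relationship is not visible in the URLs themselves and would have to be declared somewhere else.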