
RE: Updates (was Re: best practice for providing new


quicktopic
Joshua Allen suggested that "new" items in RSS feeds could be
identified by doing:
> a diff, comparing file hashes, or whatever.
	Well, that doesn't work very well with RSS as used today.
	RSS feeds generated by http://quicktopic.com provide an
excellent example of why diff, hashes, etc. don't help when working
with RSS feeds. This example should clarify the urgent need for the
combination of unique entry id and date that Atom will provide.
	QuickTopic RSS feeds are dynamically generated on demand.
Additionally, QuickTopic modifies all hrefs in the content of its
feeds so that they indirect through a "link.cgi" program. Presumably,
this allows them to track how frequently people follow links to other
sites. But the real problem is that they add unique identifiers to the
rewritten links. Those unique identifiers change for every version of
the file generated. Thus, any RSS item which is found in an RSS file
generated by QuickTopic will be different *every time* it is fetched
if it contains an external link.
	For instance,
	At 5:54 this evening I fetched
http://www.quicktopic.com/7/H/rhSrjkWgjnvRq.rss
	The first item contains a link to an external site. It is:
href="/cgi-bin/link.cgi?link=http%3A%2F%2Fwww.hyperorg.com%2Fbackissues%2Fjoho-jun17-01.html&x=215221622.4"
	At 5:56 this evening, I fetched the same RSS file and the link
had changed to:
href="/cgi-bin/link.cgi?link=http%3A%2F%2Fwww.hyperorg.com%2Fbackissues%2Fjoho-jun17-01.html&x=215221643.6"
	Note: The difference is in the "x=" parameter which is at the
end of the two hrefs. If you hash or diff these two entries, they will
be different even though the entry itself is over 7 months old!
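	To make the point concrete, here is a minimal Python sketch
that hashes the two hrefs quoted above. The helper names (item_hash,
link_target) and the choice of SHA-1 are my own assumptions for
illustration; any hash or diff would behave the same way.

```python
import hashlib
from urllib.parse import urlsplit, parse_qs

# The two hrefs quoted above, fetched two minutes apart.
href_554 = ("/cgi-bin/link.cgi?link=http%3A%2F%2Fwww.hyperorg.com"
            "%2Fbackissues%2Fjoho-jun17-01.html&x=215221622.4")
href_556 = ("/cgi-bin/link.cgi?link=http%3A%2F%2Fwww.hyperorg.com"
            "%2Fbackissues%2Fjoho-jun17-01.html&x=215221643.6")


def item_hash(item_text: str) -> str:
    """Hash the raw text of an item (hypothetical helper)."""
    return hashlib.sha1(item_text.encode("utf-8")).hexdigest()


def link_target(href: str) -> str:
    """Extract the decoded 'link=' parameter from a rewritten href."""
    query = urlsplit(href).query
    return parse_qs(query)["link"][0]


# The hashes differ because the "x=" parameter changed...
hashes_match = item_hash(href_554) == item_hash(href_556)
# ...even though both hrefs point at the same external page.
same_target = link_target(href_554) == link_target(href_556)

print(hashes_match)  # False
print(same_target)   # True
```

	A reader could try to strip the "x=" parameter before
hashing, but that is a site-specific hack that has to be rediscovered
for every feed that does something like this.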

	If this were an Atom feed, and if QuickTopic was "following
the rules", then the entry in question here would have a unique id and
a date. Rather than doing hashes or diffs of the contents of the
entry, we would be able to check that id and the modified or issued
date to determine if this was a "new" entry. But with RSS, which has
no usable mechanism for providing unique ids (I've pointed out in
other messages why GUID is useless) and no explicit indication of
"modified time", we're stuck believing that this and many other
messages from QuickTopic are "new" every time we read them.
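	As a rough sketch of the check described above, assuming each
entry arrives with a unique id and a modified date: the class name,
the method name, and the sample id and dates below are all
hypothetical, and this is not taken from any particular aggregator.

```python
from datetime import datetime, timezone


class SeenEntries:
    """Remembers the latest modified date seen for each entry id."""

    def __init__(self):
        self._modified_by_id = {}

    def classify(self, entry_id: str, modified: datetime) -> str:
        """Return "new", "updated", or "seen" for an incoming entry."""
        previous = self._modified_by_id.get(entry_id)
        if previous is None:
            # First time this id has appeared in the feed.
            self._modified_by_id[entry_id] = modified
            return "new"
        if modified > previous:
            # Same id, later modified date: a genuine update.
            self._modified_by_id[entry_id] = modified
            return "updated"
        # Same id, same (or older) date: the content bytes may differ,
        # but the entry is not new.
        return "seen"


# Hypothetical id and dates, for illustration only.
seen = SeenEntries()
entry_id = "tag:example.org,2001:entry-1"
first_modified = datetime(2001, 6, 17, tzinfo=timezone.utc)
later_modified = datetime(2001, 6, 18, tzinfo=timezone.utc)

first = seen.classify(entry_id, first_modified)
second = seen.classify(entry_id, first_modified)
third = seen.classify(entry_id, later_modified)
print(first, second, third)  # new seen updated
```

	Note that the contents of the entry never enter into the
decision, so a rewritten link or a rotated ad cannot make an old
entry look new.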
	Problems of "ever-changing items" also occur on sites like
InfoWorld that insert ads into their RSS feeds. Whenever the ad
changes, any hash-based solution is going to think the item has
changed.
	My concern with this is not some "arrogant" "technology" push.
Customers complain that they are seeing the same item in their feeds
multiple times. We need Atom to prevent flooding them with duplicate
entries.
	My belief is that the failings of RSS are so great and that
the quality of service we'll be able to provide with Atom feeds is so
much greater than what we can currently provide, that RSS use will
fall off rapidly once Atom becomes established. Users will demand it.

		bob wyman

