RE: [Watchers of the Web] The evolving form of information on
The data leaks across the transforms in that model. Is the dependent relationship one of information to format, or of choice of format to author at a point in time and space? What is the coupling relationship of format to information?

len

From: Costello, Roger L. [mailto:costello@m...]

Hi Folks,

Excellent discussion!

I would like to give an example that demonstrates the data that I think is needed to make a meaningful statement about the form of information on the Web today.

Example: Today cnn.com is running a news story about using quantum science to determine the best way to score a goal in soccer. CNN allows you to consume the news story in any of these forms:

- HTML
- audio (MP3)
- video (MPEG)
- RSS

Suppose that we monitor all the requests for that news story. By the end of the day, here are the numbers:

- 50 clients have consumed the news story in HTML form
- 20 clients have consumed the news story in audio (MP3) form
- 10 clients have consumed the news story in video (MPEG) form
- 20 clients have consumed the news story in RSS form

If these numbers represented a statistically significant sampling of the Web, then we could state:

"On May 8, 2006 the information on the Web took this form:"

Content Type   Percentage
---------------------------
HTML           50%
MP3            20%
MPEG           10%
RSS            20%

Obviously this isn't a statistically significant sample, but I think that it does show the data that is needed.

So, let me try to state the experiment more precisely:

Of all the documents/files that are exchanged on the Web over a 24-hour period,

- what percentage of them are in the form of HTML,
- what percentage of them are in the form of XML,
- what percentage of them are in the form of MP3,
- what percentage of them are in the form of MPEG,
- what percentage of them are in the form of RSS,
- and so forth for all the content types.

Now let me introduce a slight complexity to the above example: suppose that the HTML form of the above news story contains two GIF images.
Since 50 clients consumed the HTML form, 100 GIF images were consumed. If we just count documents/files, then in the 24-hour period:

- 50 HTML documents were consumed
- 100 GIF files were consumed
- 20 MP3 files were consumed
- 10 MPEG files were consumed
- 20 RSS documents were consumed

Now the percentage is this:

Content Type   Percentage
---------------------------
HTML           25%
GIF            50%
MP3            10%
MPEG            5%
RSS            10%

Somehow this doesn't "feel" right. The news story is being carried by the HTML document; the GIF images were just add-ons. It doesn't seem right to say:

"On May 8, 2006 50% of the information on the Web took the form of GIF images."

Perhaps a content type that is used within another content type should be weighted differently? For example, perhaps we should give a "dependent content type" a weight of 0.1.

Now the data looks like this:

- 50 HTML documents were consumed
- 10 GIF files were consumed (100 * 0.1 = 10)
- 20 MP3 files were consumed
- 10 MPEG files were consumed
- 20 RSS documents were consumed

And then the percentage (rounded, out of a weighted total of 110) is this:

Content Type   Percentage
---------------------------
HTML           45%
GIF             9%
MP3            18%
MPEG            9%
RSS            18%

This feels more correct to me. What do you suggest?

/Roger

-----------------------------------------------------------------
The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
initiative of OASIS <http://www.oasis-open.org>

The list archives are at http://lists.xml.org/archives/xml-dev/

To subscribe or unsubscribe from this list use the subscription
manager: <http://www.oasis-open.org/mlmanage/index.php>
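The arithmetic in Roger's example can be checked with a short Python sketch. It only restates the numbers from the message; the function name `shares` and the variable names `counts` and `weights` are hypothetical, and the 0.1 weight for a "dependent content type" is the value proposed in the message, not an established convention.

```python
def shares(counts, weights=None):
    """Return each content type's percentage of the weighted total.

    counts  -- mapping of content type to number of files consumed
    weights -- optional mapping of content type to a weight; any type
               not listed gets weight 1.0 (an assumption of this sketch)
    """
    weights = weights or {}
    weighted = {ctype: n * weights.get(ctype, 1.0) for ctype, n in counts.items()}
    total = sum(weighted.values())
    if total == 0:
        raise ValueError("no weighted requests to compute shares from")
    return {ctype: 100.0 * w / total for ctype, w in weighted.items()}


# Counts from the example: 50 HTML pages, each pulling in two GIF images.
counts = {"HTML": 50, "GIF": 100, "MP3": 20, "MPEG": 10, "RSS": 20}

unweighted = shares(counts)              # every file counts equally
weighted = shares(counts, {"GIF": 0.1})  # GIF treated as a dependent type

for ctype in counts:
    print(f"{ctype:<5} {unweighted[ctype]:5.1f}%  {weighted[ctype]:5.1f}%")
```

Keeping the weights in a separate mapping makes it easy to try other values for dependent types and see how sensitive the resulting shares are to that choice.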