RE: [Watchers of the Web] The evolving form of information on
Bryan asks an excellent question: What data should be collected to provide meaningful results?

Bryan notes two kinds of data that could be collected:

(1) File Count Data: count the number of files, i.e., count the number of HTML files, count the number of MP3 files, count the number of MPEG files, and so forth. A problem to be resolved is: suppose that an HTML file contains, say, three GIF images. Do you count that as:

    1 for HTML
    3 for GIF

Will file count yield the best data?

(2) Byte Count Data: count the number of bytes of the information on the Web that is in HTML form, count the number of bytes of the information on the Web that is in MPEG form, and so forth. Would byte count yield more meaningful data than file count?

Is there other data besides file count data and byte count data?

If you were to design an experiment to determine the percentage of information per content type, what data would you measure? (I am not asking "how" to measure the data; I am asking "what" data you would measure.)

Any ideas?

/Roger

-----Original Message-----
From: bryan rasmussen [mailto:rasmussen.bryan@g...]
Sent: Sunday, May 07, 2006 10:32 AM
To: Costello, Roger L.
Cc: xml-dev@l...
Subject: Re: [Watchers of the Web] The evolving form of information on the Web?

How exactly are you defining percentage? For example, percentage of actual data size would probably be quite a bit different from percentage of number of files.

Cheers,
Bryan Rasmussen

On 5/7/06, Costello, Roger L. <costello@m...> wrote:
> Hi Folks,
>
> There are over 350 different content (MIME) types. Some common content
> types include HTML, XML, GIF, JPG, JPEG, MP3, MPEG, RSS, SVG.
>
> Information exchanged on the Web is in the form of one of these content
> types. (Sometimes an information exchange contains a collection of
> items, each item with different content type.)
>
> I would like to know:
>
> Of all the information being exchanged on the Web:
>
> what percentage of the information is in the form of the HTML content
> type, what percentage of the information is in the form of the XML
> content type, what percentage of the information is in the form of the
> GIF content type, what percentage of the information is in the form of
> the MP3 content type, what percentage of the information is in the form
> of the MPEG content type, what percentage of the information is in the
> form of the JPG content type, and so forth, for all the content types.
>
> I speculate that the percentages are something like this:
>
> Content type    Percentage
> ---------------------------
> HTML            90%
> JPG             2%
> JPEG            2%
> GIF             2%
> MP3             2%
> XML             1%
> ...
>
> However, that's purely my guess. (What is your guess?)
>
> In addition, I am interested in seeing how the percentage is changing
> over time - I am interested in seeing the evolving form of information
> on the Web.
>
> Has anyone done such an investigation?
>
> /Roger
>
> -----------------------------------------------------------------
> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> initiative of OASIS <http://www.oasis-open.org>
>
> The list archives are at http://lists.xml.org/archives/xml-dev/
>
> To subscribe or unsubscribe from this list use the subscription
> manager: <http://www.oasis-open.org/mlmanage/index.php>
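
To make the file-count versus byte-count distinction in this thread concrete, here is a minimal sketch in Python. It is only an illustration, not a measurement method anyone in the thread proposed: the function name `shares`, the `SAMPLE` list, and every size in it are invented for the example. The sample assumes one HTML page with three GIF images (each counted as its own file) plus one MP3 file.

```python
from collections import defaultdict

# Hypothetical sample: one (content type, size in bytes) pair per resource.
# All values are invented for illustration; they are not measurements.
SAMPLE = [
    ("text/html", 10_000),     # one HTML page ...
    ("image/gif", 2_000),      # ... with three GIF images,
    ("image/gif", 2_000),      #     each counted as a separate file
    ("image/gif", 2_000),
    ("audio/mpeg", 4_000_000), # one MP3 file
]


def shares(resources):
    """Return {content_type: (percent_of_files, percent_of_bytes)}."""
    counts = defaultdict(int)
    sizes = defaultdict(int)
    for ctype, nbytes in resources:
        counts[ctype] += 1
        sizes[ctype] += nbytes
    total_files = sum(counts.values())
    total_bytes = sum(sizes.values())
    return {
        ctype: (
            100.0 * counts[ctype] / total_files,
            100.0 * sizes[ctype] / total_bytes,
        )
        for ctype in counts
    }


if __name__ == "__main__":
    for ctype, (by_files, by_bytes) in sorted(shares(SAMPLE).items()):
        print(f"{ctype:12} {by_files:5.1f}% of files  {by_bytes:5.1f}% of bytes")
```

With this made-up sample, GIF accounts for 60% of the files, while the single MP3 file accounts for more than 99% of the bytes, so the two definitions of "percentage" can rank the same content types very differently.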