Re: [Shannon: information ~ uncertainty] Ramifications to XML data exchange?
From: Rick Marshall <rjm@z...>
like a lot of these discussions, this one is in danger of slipping from
the point and confusing the mathematics with orthogonal semantics
problems. information as described by shannon is a mathematical concept.
the use of information is, if you like, a syntactic and semantic problem.

if you want to do an experiment on information (one that i hope to have
time to do on the weekend) try this.

method:

1. generate an ascii file of 1,000,000 random numbers, the same random
   numbers written as binary integers, and the same random numbers as an
   xml file <numbers><number>..</number>....</numbers>

2. compress all three files using your favourite compression program(s)
   and compare the resulting file sizes.

3. discuss.

now i haven't done the experiment yet, but if shannon is correct, they
should all come out at about the same size. which btw goes a long way to
explaining why you can't really compress a jpeg :)

rick
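A minimal sketch of the experiment above, assuming Python, gzip as the
compressor, and 16-bit integers (the width is not specified above). If
Shannon is right, the three compressed sizes should land much closer
together than the raw sizes do, since every file carries the same
1,000,000 x 16 bits of random payload however it is spelled.

    # The same 1,000,000 random numbers serialized three ways, then
    # gzipped. gzip and the 16-bit range are assumptions; any
    # general-purpose compressor and integer width make the same point.
    import array
    import gzip
    import random

    nums = [random.randint(0, 65535) for _ in range(1_000_000)]

    # 1. plain ascii text, one number per line
    ascii_data = "\n".join(str(n) for n in nums).encode("ascii")

    # 2. raw binary, 2 bytes per number
    binary_data = array.array("H", nums).tobytes()

    # 3. xml markup wrapped around every number
    xml_data = ("<numbers>"
                + "".join(f"<number>{n}</number>" for n in nums)
                + "</numbers>").encode("ascii")

    for label, data in (("ascii", ascii_data),
                        ("binary", binary_data),
                        ("xml", xml_data)):
        packed = gzip.compress(data, compresslevel=9)
        print(f"{label:6} raw={len(data):>10,} gzipped={len(packed):>10,}")

The xml file starts out several times larger, but the markup is pure
redundancy, so the compressor squeezes most of it back out; the
irreducible part is the random payload itself, which is also why an
already-compressed jpeg barely shrinks.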
Irene Polikoff wrote:

>There are two sides here - sender and receiver.
>
>The uncertainty is on the sender side. The more types of messages he is
>capable of sending, the more uncertain it is what specific message he
>will be asked to send. In this example the sender is the prison.
>
>The information is on the receiver side. This doesn't deal with any
>uncertainty the receiver may have because the information is wrong, or
>for whatever reason. The receiver, presumably, needs to do something
>with the information it receives. The smaller the number of message
>types it could receive, the easier it is for "him" to react.
>
>I put "him" in quotes because in the context we are discussing, both
>sender and receiver are computer programs. Low uncertainty and low
>information make it easier to create software to process it.
>
>For the humans/businesses behind the sender and receiver, this is
>another story: the reduction of information may have undesirable
>effects. I think Roger's question is legitimate - yes, in some ways XML
>schemas reduce information. Is it a bad thing? My guess is the answer
>is "it depends".
>
>-----Original Message-----
>From: sterling [mailto:sstouden@t...]
>Sent: Wednesday, October 13, 2004 11:26 PM
>To: Roger L. Costello
>Cc: xml-dev@l...
>Subject: Re: [Shannon: information ~ uncertainty] Ramifications to
>XML data exchange?
>
>I hate to show my stupidity, but if the message regarding the condition
>of the prisoner [a constant] is irrelevant to the condition it is meant
>to express to the intended receiver, then the message is just as
>uncertain to the receiver as no information at all. The state of no
>information is the ultimate in uncertainty.
>
>Do you not mean that high uncertainty equates with low amounts of
>reliable information, because high amounts of reliable information are
>positively related to greater certainty?
>
>Hence, the reliability of information is a factor of the consequence of
>its use, and uncertainty approaches certainty as the predictable
>consequence of reliance becomes more certain.
>
>Furthermore, if I am trying to predict the weather, it does not matter
>which message or how many messages are sent, since the message is
>related neither to the condition of the person whose condition is to be
>expressed nor to the weather conditions.
>
>I know nothing of Shannon, but information that has no foundation (that
>is, random information unrelated to the circumstances of its source or
>use) is not information; it is, at its very best, data.
>
>Information is an object with many modifiers and much utility!
>
>My two bits.
>
>On Mon, 11 Oct 2004, Roger L. Costello wrote:
>
>>Hi Folks,
>>
>>I am trying to get an understanding of Claude Shannon's work on
>>information theory. Below I describe one small part of Shannon's work.
>>I would like to hear your thoughts on its ramifications to information
>>exchange using XML.
>>
>>INFORMATION
>>
>>Shannon defines information as follows:
>>
>>  Information is proportional to uncertainty. High uncertainty equates
>>  to a high amount of information. Low uncertainty equates to a low
>>  amount of information.
>>
>>  More specifically, Shannon talks about a set of possible data.
>>  A set comprised of 10 possible choices of data has less information
>>  than a set comprised of a hundred possible choices.
>>
>>This may seem rather counterintuitive, but bear with me as I give an
>>example.
>>
>>In a book I am reading[1] the author gives an example which provides a
>>nice intuition of Shannon's statement that information is proportional
>>to uncertainty.
>>
>>EXAMPLE
>>
>>Imagine that a man is in prison and wants to send a message to his
>>wife. Suppose that the prison only allows one message to be sent, "I am
>>fine". Even if the prisoner is deathly ill, all he can send is "I am
>>fine". Clearly there is no information in this message.
>>
>>Here the set of possible messages is one. There is no uncertainty and
>>there is no information.
>>
>>Suppose that the prison allows one of two messages to be sent, "I am
>>fine" or "I am ill". If the prisoner sends one of these messages then
>>some information will be passed to his wife.
>>
>>Here the set of possible messages is two. There is uncertainty (over
>>which message will be sent). When one of the two messages is selected
>>by the prisoner and sent to his wife, some information is passed.
>>
>>Suppose that the prison allows one of four messages to be sent:
>>
>>1. I am healthy and happy
>>2. I am healthy but not happy
>>3. I am happy but not healthy
>>4. I am not happy and not healthy
>>
>>If the prisoner sends one of these messages then even more information
>>will be passed.
>>
>>Thus, the bigger the set of potential messages, the more uncertainty.
>>The more uncertainty there is, the more information there is.
>>
>>Interestingly, it doesn't matter what the messages are. All that
>>matters is the "number" of messages in the set. Thus, there is the same
>>amount of information in this set:
>>
>>  {"I am fine", "I am ill"}
>>
>>as there is in this set:
>>
>>  {A, B}
>>
>>SIDE NOTES
>>
>>a. Part of Shannon's goal was to measure the "amount" of information.
>>   In the example above where there are two possible messages the
>>   amount of information is 1 bit. In the example where there are four
>>   possible messages the amount of information is 2 bits.
>>
>>b. Shannon refers to uncertainty as "entropy". Thus, the higher the
>>   entropy (uncertainty), the higher the information. The lower the
>>   entropy, the lower the information.
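Side note (a) is the equal-probability case of Shannon's measure: a set
of N equally likely messages carries log2(N) bits. A quick check, in
Python as in the sketch above:

    # H = log2(N) bits for N equally likely messages:
    # 1 message  -> 0 bits (the prison's lone "I am fine"),
    # 2 messages -> 1 bit, 4 messages -> 2 bits.
    import math

    for n in (1, 2, 4, 10, 100):
        print(f"{n:3} messages -> {math.log2(n):.2f} bits")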
>>QUESTIONS
>>
>>1. How does this aspect (information ~ uncertainty) of Shannon's work
>>relate to data exchange using XML? (I realize that this is a very broad
>>question. Its intent is to stimulate discussion on the application of
>>Shannon's information/uncertainty ideas to XML data exchange.)
>>
>>2. A schema is used to restrict the allowable forms that an instance
>>document may take. So doesn't a schema reduce information?
>>
>>/Roger
>>
>>[1] An Introduction to Cybernetics by Ross Ashby
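One way to make question 2 concrete (a hedged sketch; the value counts
and probabilities below are invented for illustration): the general form
of Shannon's measure, H = -sum(p * log2 p), falls as a schema shrinks
the set of valid messages.

    # Entropy of a message source before and after schema restriction.
    # The distributions below are made up for illustration.
    import math

    def entropy(probs):
        # shannon entropy in bits of a probability distribution
        return -sum(p * math.log2(p) for p in probs if p > 0)

    print(entropy([1 / 8] * 8))  # 3.0 bits: any of 8 equally likely values
    print(entropy([0.5, 0.5]))   # 1.0 bit: an enumeration allows only two
    print(entropy([1.0]))        # 0.0 bits: one valid value, like the
                                 # prison's single permitted message

By this yardstick a tighter schema does mean fewer possible instance
documents and so less information per document; whether that reduction
is a bad thing is the semantic question Irene raises above.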