
Re: [Shannon: information ~ uncertainty] Ramifications to XML d


Roger L. Costello wrote:
> I am trying to get an understanding of Claude Shannon's work on
> information theory. Below I describe one small part of Shannon's
> work.  I would like to hear your thoughts on its ramifications to
> information exchange using XML.
> 
> INFORMATION
> 
> Shannon defines information as follows:
> 
> Information is proportional to uncertainty.  High uncertainty equates
>  to a high amount of information.  Low uncertainty equates to a low 
> amount of information.
> 
> More specifically, Shannon talks about a set of possible data. A set
> comprised of 10 possible choices of data has less information than a
> set comprised of a hundred possible choices.

> ... QUESTIONS
> 
> 1. How does this aspect (information ~ uncertainty) of Shannon's work
> relate to data exchange using XML?  (I realize that this is a very
> broad question. Its intent is to stimulate discussion on the
> application of Shannon's information/uncertainty ideas to XML data
> exchange)
> 
> 2. A schema is used to restrict the allowable forms that an instance 
> document may take.  So doesn't a schema reduce information?

I think you should be very cautious and thoughtful about trying to apply
Shannon to the sending of xml messages.  I think there are some tricky 
aspects that could make things non-obvious.  Some examples:

1) A schema does not necessarily reduce the number of possible messages 
if it is possible to send schema-invalid messages over the channel. 
What conditions about restricting the message set need to be in place 
for Shannon's work to apply directly?

2) Under most schemas, there are an infinite number of possible messages 
(since most or all elements or attributes could hold content of 
indefinite length).  The usual measures of log N or log N/N aren't 
useful in this circumstance.

3) Shannon's work is usually thought of in terms of whether tokens get 
through the communications channel uncorrupted or not.  Is it 
technically correct to think of a single xml message as a token (I think 
not)?  If not, what if anything would play this role in an xml message?
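
To make point 2 concrete, here is a minimal Python sketch. It assumes all 
messages are equally likely, so that one message carries log2 N bits, and 
it assumes, purely for illustration, that element content is drawn from a 
26-letter alphabet. Neither assumption comes from the original question, 
and the function names are hypothetical. It shows that a set of 100 
choices carries twice the bits of a set of 10, and that the message count 
N, and with it log2 N, keeps growing as the allowed content length grows.

```python
import math


def uniform_information_bits(n_messages: int) -> float:
    """Bits needed to select one of n equally likely messages (log2 N)."""
    if n_messages < 1:
        raise ValueError("need at least one possible message")
    return math.log2(n_messages)


def count_strings(alphabet_size: int, max_length: int) -> int:
    """Number of distinct strings of length 0..max_length over the alphabet."""
    return sum(alphabet_size ** k for k in range(max_length + 1))


# Finite message sets: 10 choices versus 100 choices.
bits_10 = uniform_information_bits(10)
bits_100 = uniform_information_bits(100)

# Unbounded content: log2 N rises with every increase in the length limit,
# so there is no finite N to plug into the usual formula.
# The alphabet size of 26 is an illustrative assumption.
growth = [
    (max_len, uniform_information_bits(count_strings(26, max_len)))
    for max_len in (1, 10, 100)
]

if __name__ == "__main__":
    print(f"10 choices:  {bits_10:.3f} bits")
    print(f"100 choices: {bits_100:.3f} bits")
    for max_len, bits in growth:
        print(f"content up to {max_len} chars: about {bits:.1f} bits")
```

A real analysis would need a probability distribution over messages 
rather than a uniform count, which is part of why the direct application 
is not obvious.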

I am not well-versed in this area, so I will step back and let others 
who are, do the talking.

Cheers,

Tom P


-- 
Thomas B. Passin
Explorer's Guide to the Semantic Web (Manning Books)
http://www.manning.com/catalog/view.php?book=passin
