Re: [Shannon: information ~ uncertainty] Ramifications to XML
----- Original Message -----
From: "Bullard, Claude L (Len)" <len.bullard@i...>
To: "'TAN Kuan Hui'" <kuanhui@x...>; "Roger L. Costello" <costello@m...>; <xml-dev@l...>
Sent: Tuesday, October 12, 2004 10:16 PM
Subject: RE: [Shannon: information ~ uncertainty] Ramifications to XML data exchange?

> Correct. Discussions of Markov models are appropriate.

Agree, temporal relevancy is very useful. Entropy varies with time as information fluctuates, which makes system modelling interestingly complex and dynamic. If Google factored datetime into the relevancy of the results it returns, IMO it would definitely improve the information value of a query. Churning through huge databases to keep temporal relevancy in sync will be challenging. So understanding Shannon theory w.r.t. XML data, and returning interactions with that data with greater relevancy and information value to the user, is a useful discussion. Generic modelling of relevancy w.r.t. domain-specific vocabs and XML schemas will be interesting, aka semantic prediction w.r.t. domain vocabs for query purposes.

> Analysis of frequency of letters is applied to text categorization,
> and other pattern-based analysis used for prediction. Imagine a
> tool that scans texts and based on this analysis, creates a
> schematic description of the frequencies of occurrence of
> some set of categorical types. Would that output be close
> or equivalent to a DTD or Schema? Is a DTD/Schema a pattern
> generated by a learning/negotiation process?

I think we are confusing the "structure and syntax of information" with the "content and semantics of information", which IMO is what dictates the value of the information. The former only provides a structured set of vocabs for communication. As an aside, this is why searching on stop words is not very useful and relevancy improves with (e.g.) phrase searches: the longer the phrase, the lower its probability of occurrence, the higher the relevancy of the query, and the higher the value of the query.
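The phrase-length remark above can be made concrete with Shannon's self-information, -log2 p. What follows is only a minimal sketch, assuming words occur independently of each other (a strong simplification) and using a made-up toy corpus; `unigram_probs` and `phrase_surprisal_bits` are hypothetical helper names, not part of any library.

```python
import math
from collections import Counter


def unigram_probs(corpus_words):
    """Relative frequency of each word in a (toy) corpus."""
    counts = Counter(corpus_words)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def phrase_surprisal_bits(phrase, probs):
    """Self-information -log2 p(phrase) in bits.

    Assumption: words are independent, so p(phrase) is the product of
    the word probabilities and the surprisals simply add up.
    Words missing from the corpus raise KeyError in this sketch.
    """
    return sum(-math.log2(probs[word]) for word in phrase.split())


# Made-up corpus, for illustration only.
corpus = "the schema of the document describes the structure of the data".split()
probs = unigram_probs(corpus)

for phrase in ("the", "the schema", "the schema of the document"):
    print(f"{phrase!r}: {phrase_surprisal_bits(phrase, probs):.2f} bits")
```

Under the independence assumption every extra word adds a non-negative number of bits, so a longer phrase is never less surprising than its prefix, and a stop word such as "the" contributes very little.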
The less redundancy there is in an XML data set, the higher, therefore, its information value must be.

> In the following, I will summarize from
>
> http://www.comp.leeds.ac.uk/roger/HiddenMarkovModels/html_dev/main.html
>
> The three principal applications of Markov modeling are:
>
> o Evaluation model: Discovering the probability of an observable sequence
>   (apply the forward algorithm)
> o Decoding model: Discovering the sequence of hidden states that create the
>   observable sequence (apply the Viterbi algorithm)
> o Learning model: Given observed sequences, estimating the parameters of
>   the hidden model (apply the forward/backward algorithm)
>
> In fact, the majority of texts we exchange are not random and
> not all choices in the Shannon sense are equally probable. They
> are 'meaningful'. Understanding how texts acquire the property
> of meaning implies one understands how multiple systems, even
> ones where within each system some choices are equally probable
> (non-deterministic) and some are not (relative determinism),
> when interacting reduce or increase entropy.

Statistically, multiple systems' interactions should aggregate up and can still be modelled as a single black box.

> Determinism varies system by system. The arrow of time does not
> in and of itself produce steady increases in entropy.

Temporal correlations are usually higher with near-term events than with those that occurred some time ago. Entropy therefore should be higher when evaluated with reference to recent data.

> Only thermodynamically
> isolated systems fit that model. A system interoperating
> with other systems and exchanging energy changes that outcome.
> A Markov model assumes we can predict a future state based
> on past states.

The XML Schema gives us a starting point in that model.
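To make the first two Markov applications listed above a little more tangible, here is a minimal sketch of the forward algorithm (evaluation) and the Viterbi algorithm (decoding) on a toy hidden Markov model. Everything about the model is an assumption made up for illustration: the hidden states "head" and "body", the observed element names "title" and "para", and all the probabilities. The learning problem (forward/backward) is not covered.

```python
def forward(obs, states, start_p, trans_p, emit_p):
    """Evaluation: probability of the observed sequence under the model."""
    # alpha[s] = probability of the observations so far, ending in state s
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {
            s: emit_p[s][o] * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Decoding: (probability, path) of the most likely hidden-state sequence."""
    # best[s] = (probability, path) of the best path that ends in state s
    best = {s: (start_p[s] * emit_p[s][obs[0]], [s]) for s in states}
    for o in obs[1:]:
        new_best = {}
        for s in states:
            prob, path = max(
                ((best[r][0] * trans_p[r][s], best[r][1]) for r in states),
                key=lambda candidate: candidate[0],
            )
            new_best[s] = (prob * emit_p[s][o], path + [s])
        best = new_best
    return max(best.values(), key=lambda candidate: candidate[0])


# Hypothetical toy model: which part of a document emitted each element?
states = ("head", "body")
start_p = {"head": 0.8, "body": 0.2}
trans_p = {
    "head": {"head": 0.3, "body": 0.7},
    "body": {"head": 0.1, "body": 0.9},
}
emit_p = {
    "head": {"title": 0.9, "para": 0.1},
    "body": {"title": 0.2, "para": 0.8},
}
obs = ["title", "para", "para"]

likelihood = forward(obs, states, start_p, trans_p, emit_p)
best_prob, best_path = viterbi(obs, states, start_p, trans_p, emit_p)
print("sequence likelihood:", likelihood)
print("most likely hidden path:", best_path, "with probability", best_prob)
```

In this picture a schema would play the role of a prior on the transition table, i.e. which element may follow which, while the probabilities themselves would still have to be estimated from real documents.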