Re: Statistical vs "semantic web" approaches to making sense o
Mike Champion wrote:
>
> This raises a question, for me anyway: If it will take a "better Google
> than Google" (or perhaps an "Autonomy meets RDF") that uses Bayesian or
> similar statistical techniques to create the markup that the Semantic Web
> will exploit, what's the point of the semantic markup? Why won't people
> just use the "intelligent" software directly? Wearing my "XML database
> guy" hat, I hope that the answer is that it will be much more efficient and
> programmer-friendly to query databases generated by the 'bots containing
> markup and metadata to find the information one needs. But I must admit
> that 5-6 years ago I thought the world would need standardized, widely
> deployed XML markup before we could get the quality of searches that Google
> allows today using only raw HTML and the PageRank heuristic algorithm.
>
> So, anyone care to pick holes in my assumptions, or reasoning? If one does
> accept the hypothesis that it will take smart software to produce the
> markup that the Semantic Web will exploit, what *is* the case for believing
> that it will be ontology-based logical inference engines rather than
> statistically-based heuristic search engines that people will be using in
> 5-10 years? Or is this a false dichotomy?

Yes, this is an entirely false dichotomy, but you've asked an extremely
important question. Forget all the hype that we've been hearing about the
SW/AI etc. and let's look at what the current reality is -- OWL is
*fundamentally* about classifications. OWL "reasoners" are rightly termed
"classifiers", but OWL doesn't employ statistics -- a thing is or isn't a
member of a class. To link OWL-type classifiers with real-world data, there
must be a leap that puts something into a class in the first place, and this
is where statistical-type processors might function.
Let's use the following example: Suppose we have a bunch of noisy binary
data about a group of people, some of whom, let's say, have SARS. Some of
the data might be audio, some video, some text, etc. Now suppose we have a
statistical process that is able to cluster individuals together in groups.
This processor might emit the following class:

  <owl:Class rdf:ID="Foo">
    <owl:oneOf rdf:parseType="Literal">
      <ex:person rdf:resource="#Bill"/>
      <ex:person rdf:resource="#Dave"/>
      <ex:person rdf:resource="#Sue"/>
      <ex:person rdf:resource="#Nancy"/>
      <ex:person rdf:resource="#Freddy"/>
    </owl:oneOf>
  </owl:Class>

Our reasoner might be able to derive that

  <owl:Class rdf:ID="Bar">
    <owl:intersectionOf>
      <owl:Class rdf:resource="#hasCough"/>
      <owl:Class rdf:resource="#hasFever"/>
      <owl:Class rdf:resource="#hasVirus.x233444"/>
  ...

  #Foo owl:subClassOf #Bar

and even, in the proper circumstances, that...

  #Bar owl:sameClassAs #SARS

So the Bayesian/statistical processes might very well be used to jump-start
a logical classification process that tells us something quite useful.

Jonathan
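
As a rough sketch of the two-stage idea in the example (a statistical step
that puts individuals into a class, then a crisp logical check), here is a
small Python illustration. Everything in it is an assumption made for
illustration: the cluster scores, the 0.5 threshold, the extra individual
"Alice", and the helper names enumerate_class, intersection_of and
is_subclass_of are all hypothetical. It also compares only the known members
of each class under a closed-world reading, which is a simplification; a real
OWL classifier reasons over class definitions rather than enumerated sets.

```python
# Hypothetical cluster-membership scores, standing in for the output of
# some statistical process run over the noisy audio/video/text data.
cluster_scores = {
    "Bill": 0.91, "Dave": 0.84, "Sue": 0.77,
    "Nancy": 0.69, "Freddy": 0.88, "Alice": 0.12,
}
THRESHOLD = 0.5  # assumed cut-off; this is the "leap" into a crisp class


def enumerate_class(scores, threshold):
    """Statistical step: turn soft scores into an enumerated class,
    analogous to an owl:oneOf list of individuals."""
    return frozenset(name for name, s in scores.items() if s >= threshold)


def intersection_of(*classes):
    """Logical step: a class whose members belong to every given class,
    analogous to owl:intersectionOf."""
    return frozenset(set.intersection(*(set(c) for c in classes)))


def is_subclass_of(sub, sup):
    """Crisp check: every known member of sub is also a member of sup."""
    return set(sub) <= set(sup)


# Foo: the class emitted by the statistical step.
foo = enumerate_class(cluster_scores, THRESHOLD)

# Hypothetical symptom classes; a thing is or isn't a member of each.
everyone = set(cluster_scores)
has_cough = everyone
has_fever = everyone
has_virus = everyone

# Bar: the intersection of the three symptom classes.
bar = intersection_of(has_cough, has_fever, has_virus)

# The derivation "#Foo subClassOf #Bar", checked extensionally.
foo_is_subclass_of_bar = is_subclass_of(foo, bar)

print(sorted(foo))
print(foo_is_subclass_of_bar)
```

The design point is only the split of responsibilities: the threshold is the
one place where statistics enter, and everything after it is plain yes-or-no
set membership.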