XML-DEV Mailing List Archive
Re: Meta-somethingorother (was the semantic web mega-permathre
Michael Champion wrote:

> OK, I'll defend it -- Tell me which of these you disagree with :-)
>
> 2.1 People lie
> 2.2 People are lazy
> 2.3 People are stupid
> 2.4 Mission: Impossible -- know thyself
> 2.5 Schemas aren't neutral
> 2.6 Metrics influence results
> 2.7 There's more than one way to describe something

I can just as easily ask you which of those are not relevant to Google or any other statistical approach (answer: none).

> What is there to disagree with here?

The title. But specifically - "reliable". About all you can say about statistically accrued metadata is that it's inherently more statistical - beyond that you have to get into specifics.

> I personally find this the least
> plausible part of the semantic web vision -- I won't even begin to
> believe it until it has survived the onslaughts of the meta-spammers and
> the semantic-bombers who will go after the semantic web the way they've
> gone after data in meta tags and the links that Google harvests.
> Dare said something the other day about having second thoughts about
> Doctorow's argument because RSS feeds are an existence proof that useful
> metadata is practical.

One can't completely buy into either of those positions without being ignorant about a number of AI and ex-AI technologies. They're both wrong because they're both polarized in a Jerry Springer kind of way.

> I'm not sure which of the straw men that
> demolishes -- I'd agree that people are less likely to lie or act lazy
> and stupid when they know that people (like the boss, or colleagues, or
> potential employers) are watching. And anyway, RSS *is* mostly
> observational metadata extracted from an article or post, or at least
> generated from the same inputs used to generate the content it syndicates.

If you look at RSS data for long enough you realize it's information rich and that the information is being produced almost totally as a side effect of blogging.
Then maybe you go and read Metacrap, but with your eyes opened.

Here are the RDF triples taken from an Atom feed with two entries, and excluding the content and the summary.

http://www.dehora.net/journal/ http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://purl.org/atom/ns#feed
http://www.dehora.net/journal/ http://purl.org/atom/ns#title "Bill de hÓra"
http://www.dehora.net/journal/ http://purl.org/atom/ns#rel "alternate"
http://www.dehora.net/journal/ http://purl.org/atom/ns#type "text/html"
http://www.dehora.net/journal/ http://purl.org/atom/ns#href "http://www.dehora.net/journal/"
http://www.dehora.net/journal/ http://purl.org/atom/ns#link http://www.dehora.net/journal/
http://www.dehora.net/journal/ http://purl.org/atom/ns#modified "2004-05-23T01:45:36Z"
http://www.dehora.net/journal/ http://purl.org/atom/ns#tagline "FD85 1117 1888 1681 7689 B5DF E696 885C 20D8 21F8"
http://www.dehora.net/journal/ http://purl.org/atom/ns#id tag:www.dehora.net,2004:/journal?id
http://www.dehora.net/journal/ http://purl.org/atom/ns#generator "Movable Type"
http://www.dehora.net/journal/ http://purl.org/atom/ns#copyright "Copyright (c) 2004, dehora"
http://www.dehora.net/journal/2004/05/thus_sprach_metadata.html?id http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://purl.org/atom/ns#entry
http://www.dehora.net/journal/2004/05/thus_sprach_metadata.html?id http://purl.org/atom/ns#feed http://www.dehora.net/journal/
http://www.dehora.net/journal/2004/05/thus_sprach_metadata.html?id http://purl.org/atom/ns#title "Thus sprach metadata"
http://www.dehora.net/journal/2004/05/thus_sprach_metadata.html http://purl.org/atom/ns#rel "alternate"
http://www.dehora.net/journal/2004/05/thus_sprach_metadata.html http://purl.org/atom/ns#type "text/html"
http://www.dehora.net/journal/2004/05/thus_sprach_metadata.html http://purl.org/atom/ns#href "http://www.dehora.net/journal/2004/05/thus_sprach_metadata.html"
http://www.dehora.net/journal/2004/05/thus_sprach_metadata.html?id http://purl.org/atom/ns#link http://www.dehora.net/journal/2004/05/thus_sprach_metadata.html
http://www.dehora.net/journal/2004/05/thus_sprach_metadata.html?id http://purl.org/atom/ns#modified "2004-05-23T01:45:36Z"
http://www.dehora.net/journal/2004/05/thus_sprach_metadata.html?id http://purl.org/atom/ns#issued "2004-05-23T01:45:36+00:00"
http://www.dehora.net/journal/2004/05/thus_sprach_metadata.html?id http://purl.org/atom/ns#id http://www.dehora.net/journal/2004/05/thus_sprach_metadata.html?id
http://www.dehora.net/journal/2004/05/thus_sprach_metadata.html?id http://purl.org/atom/ns#created "2004-05-23T01:45:36Z"
http://www.dehora.net/journal/2004/05/thus_sprach_metadata.html?id http://purl.org/atom/ns#summary "Seairth Jacobs"
mailto:bill@d... http://purl.org/atom/ns#name "dehora"
mailto:bill@d... http://purl.org/atom/ns#url http://www.dehora.net/journal
mailto:bill@d... http://purl.org/atom/ns#email "bill@d..."
http://www.dehora.net/journal/2004/05/thus_sprach_metadata.html?id http://purl.org/atom/ns#author mailto:bill@d...
http://www.dehora.net/journal/2004/05/thus_sprach_metadata.html?id http://purl.org/dc/elements/1.1/subject "SemanticWeb"
genid:ARP58526 http://purl.org/atom/ns#type "text/html"
genid:ARP58526 http://purl.org/atom/ns#mode "escaped"
http://www.dehora.net/journal/2004/05/thus_sprach_metadata.html?id http://purl.org/atom/ns#content genid:ARP58526
http://www.dehora.net/journal/2004/05/mt3_are_you_not_entertained.html?id http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://purl.org/atom/ns#entry
http://www.dehora.net/journal/2004/05/mt3_are_you_not_entertained.html?id http://purl.org/atom/ns#feed http://www.dehora.net/journal/
http://www.dehora.net/journal/2004/05/mt3_are_you_not_entertained.html?id http://purl.org/atom/ns#title "MT3: are you not entertained?"
http://www.dehora.net/journal/2004/05/mt3_are_you_not_entertained.html http://purl.org/atom/ns#rel "alternate"
http://www.dehora.net/journal/2004/05/mt3_are_you_not_entertained.html http://purl.org/atom/ns#type "text/html"
http://www.dehora.net/journal/2004/05/mt3_are_you_not_entertained.html http://purl.org/atom/ns#href "http://www.dehora.net/journal/2004/05/mt3_are_you_not_entertained.html"
http://www.dehora.net/journal/2004/05/mt3_are_you_not_entertained.html?id http://purl.org/atom/ns#link http://www.dehora.net/journal/2004/05/mt3_are_you_not_entertained.html
http://www.dehora.net/journal/2004/05/mt3_are_you_not_entertained.html?id http://purl.org/atom/ns#modified "2004-05-21T20:57:11Z"
http://www.dehora.net/journal/2004/05/mt3_are_you_not_entertained.html?id http://purl.org/atom/ns#issued "2004-05-21T20:57:11+00:00"
http://www.dehora.net/journal/2004/05/mt3_are_you_not_entertained.html?id http://purl.org/atom/ns#id http://www.dehora.net/journal/2004/05/mt3_are_you_not_entertained.html?id
http://www.dehora.net/journal/2004/05/mt3_are_you_not_entertained.html?id http://purl.org/atom/ns#created "2004-05-21T20:57:11Z"
http://www.dehora.net/journal/2004/05/mt3_are_you_not_entertained.html?id http://purl.org/atom/ns#summary "foo"
mailto:bill@d... http://purl.org/atom/ns#name "dehora"
mailto:bill@d... http://purl.org/atom/ns#url http://www.dehora.net/journal
mailto:bill@d... http://purl.org/atom/ns#email "bill@d..."
http://www.dehora.net/journal/2004/05/mt3_are_you_not_entertained.html?id http://purl.org/atom/ns#author mailto:bill@d...
http://www.dehora.net/journal/2004/05/mt3_are_you_not_entertained.html? http://purl.org/dc/elements/1.1/subject "CuzIDontLikeToDreamAboutGettinPaid"
genid:ARP58533 http://purl.org/atom/ns#type "text/html"
genid:ARP58533 http://purl.org/atom/ns#mode "escaped"
http://www.dehora.net/journal/2004/05/mt3_are_you_not_entertained.html?id http://purl.org/atom/ns#content genid:ARP58533

I agree it ain't pretty, but it is a perfectly good machine-processable metadata graph of 3-tuples (at its most basic, property-value pairs bound to a named entity). Forget RDF, you can usefully run SQL or grep pipelines against this stuff; a small script will do the job most of the time. But as you build these datasets over time you get to do all kinds of things (such as type inference and joins across arbitrary XML vocabularies and database schemas) if you don't mind using RDF-aware tools. Moreover an RDF engine couldn't care less that the metadata was sourced from an RSS feed. I can append the Windows event log or /var/log/messages to that lot without changing my scripts or RDF queries. Yes, you may not have the schema to hand for the URIs if you want to do more heavy lifting (but this problem is not restricted to RDF).

As a down to earth example I tend more and more to generate logs designed to be loaded up as RDF triples. This is extremely useful for systems management, server operations and message tracking, or anything which doesn't (and shouldn't, and simply can't) care about the details of a plethora of application suites, grammars, log formats, protocols, server topologies, data-centers and so on, but does have to care about finding out what the heck is going on. And no, you can't do this with XML+Namespaces+HTTP, not to the same extent and at the same cost.

> On the other hand, Doctorow's "screed" does call into question the
> WinFS vision, or am I missing something here? To what extent does WinFS
> not presuppose honest, energetic, intelligent, and self-aware humans to
> create the metadata it will manage and query?

Not much.
Most of it is flying about and never captured; you just have to know how to grab it. Huge amounts of useful metadata can be captured without ever asking users to do anything extra - witness the RSS above; but the residue produced by your blogging activity is going to be a fraction of the residue produced by your operating system activity.

cheers
Bill
--
Propylon
http://www.propylon.com
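
Editorial sketch, not part of the original message: the claim that SQL or a small script can usefully query a triple listing like the one in the message can be illustrated roughly as follows. It assumes each triple sits on one line as subject, predicate, object, with literals in double quotes. The five sample triples are copied from the message; the table name `triples`, the function names, and the choice of Python with the standard `sqlite3` module are assumptions made for illustration, not the author's tooling.

```python
import sqlite3

# Five triples copied from the message, one per line:
# subject, predicate, then object (quoted when it is a literal).
TRIPLES = '''\
http://www.dehora.net/journal/2004/05/thus_sprach_metadata.html?id http://purl.org/atom/ns#title "Thus sprach metadata"
http://www.dehora.net/journal/2004/05/thus_sprach_metadata.html?id http://purl.org/atom/ns#modified "2004-05-23T01:45:36Z"
http://www.dehora.net/journal/2004/05/thus_sprach_metadata.html?id http://purl.org/dc/elements/1.1/subject "SemanticWeb"
http://www.dehora.net/journal/2004/05/mt3_are_you_not_entertained.html?id http://purl.org/atom/ns#title "MT3: are you not entertained?"
http://www.dehora.net/journal/2004/05/mt3_are_you_not_entertained.html?id http://purl.org/atom/ns#modified "2004-05-21T20:57:11Z"
'''


def parse_triples(text):
    """Split each non-empty line into a (subject, predicate, object) tuple."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Subject and predicate contain no spaces; the object is the rest.
        subject, predicate, obj = line.split(None, 2)
        # Literals are quoted in the listing; URIs are bare.
        if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
            obj = obj[1:-1]
        rows.append((subject, predicate, obj))
    return rows


def load(conn, rows):
    """Store the triples in a single three-column table (hypothetical name)."""
    conn.execute("CREATE TABLE IF NOT EXISTS triples (s TEXT, p TEXT, o TEXT)")
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", rows)


def titles_by_modified(conn):
    """Self-join on the subject to pair each title with its modified time."""
    query = """
        SELECT t.o, m.o
        FROM triples AS t JOIN triples AS m ON t.s = m.s
        WHERE t.p = 'http://purl.org/atom/ns#title'
          AND m.p = 'http://purl.org/atom/ns#modified'
        ORDER BY m.o
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
load(conn, parse_triples(TRIPLES))
for title, modified in titles_by_modified(conn):
    print(modified, title)
```

Run as written, this prints the MT3 entry (modified 2004-05-21) before the "Thus sprach metadata" entry (modified 2004-05-23). Because the table knows nothing about Atom, triples derived from other sources could in principle be loaded into it without changing the query, which is the property the message relies on.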