RE: XSchema Question 1: RDF
> From: Tim Bray
>
> RDF is painfully simple, conceptually. And Lisa is correct in saying that
> the syntax is (IMHO unnecessarily) kinda ugly; I think there are good
> reasons to expect improvement.

My impression of RDF is that it has a superficially appealing model, but that its current syntax is so bad it will cause just as many problems as it solves. I have been told that the syntax was designed the way it is because that is what the metadata community, whoever they are, demand: the experience of the markup community is regarded as of secondary interest, I guess because RDF is regarded as so novel that mankind has never embarked on anything like it. But I am being too bitchy.

In particular, RDF has the big problem that it is a system of serialization, not of markup. The difference is this: in a markup language, the data comes first, and the trick is how to reveal the structures which are of interest: the organizing principle for the schema is the natural structure of the data. In a serialization language, the data exists pre-chunked and pre-labelled: so having a nice manipulatable schema system (e.g. relational tables) becomes the organizing principle of the schema.

So, for example, the <RDF:RDF> element groups things together: it has no real purpose, and implements the built-in assumption that data is grouped together. But anyone who has marked up text knows that interesting data is plonked all over the place: often duplicated and often partial or wrong. Hence in SGML the emphasis on external markup of relationships: HyTime, XML extended pointers, and so on.

The RDF syntax is IMHO completely skewed to the problem of "how can we make something that will not upset HTML?": it is starting off crippled. Syntax is not a trivial issue. RDF should use processing instructions or just some fixed attributes. I cannot see why they insist on using elements: they confuse and disguise the non-RDF element structure.

If people are interested in semantic markup, they would be better off looking at Topic Navigation Maps for the immediate term. You can find it at http://www.ornl.gov/sgml/wg8/document/1950.htm TNM can be analysed in terms of the RDF categories: it would make an excellent external syntax for RDF.

> But it is easy to tell if something can easily be made into RDF. Here's
> the test: if what you are building can be expressed as a bunch of 3-tuples
>
> (object, propertyname, propertyvalue)
>
> then it's RDF-able. Otherwise it's not.

I thought Lisa's comment about it being early yet was interesting. In fact, ontology and metaphysics form one of the great Western traditions, from the Greeks until now. And AI has been studied for almost 30 years: based on the history of the study of knowledge representation, there is no reason to expect any great speed.

Artificial Intelligence diverged into three fields: adaptive systems, rule-based systems, and knowledge representation. Adaptive systems have flourished unseen (the "adaptive equaliser" in your modem is a "neural net", genetic algorithms are used in stock markets), but rule-based systems and knowledge representation AI work foundered. (I see rule-based systems made a little resurgence recently in the guise of "data mining".) The reason was that it was so difficult to capture enough knowledge. I think RDF is an attempt to create a massive world-wide knowledge base, so that old AI hacks will have something to do with their time.

The trouble is that even if we do have information modeled in RDF, there is every chance that, unless the categories it expresses are consistent and appropriate to the AI task intended, the AI will be fed skewed or incomplete information. A lot of problems are very domain-specific: having an incomplete knowledge base means that searching on that base can only be done with a degree of tentativeness. (In Topic Navigation Maps, ISO put in a "Weighting" attribute to express this kind of fuzziness.)
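
The 3-tuple test quoted above can be made concrete with a small sketch. Everything named here is an assumption for illustration only: the `to_triples` and `from_triples` helpers, the example.org identifier, and the property names are hypothetical and do not come from RDF or from any proposal in this thread. The sketch shows only that flat property/value data decomposes into (object, propertyname, propertyvalue) tuples and can be regrouped without loss.

```python
from typing import Dict, List, Tuple

# One statement in the (object, propertyname, propertyvalue) form from the quoted test.
Triple = Tuple[str, str, str]


def to_triples(obj: str, properties: Dict[str, str]) -> List[Triple]:
    """Flatten one object's property/value pairs into a list of 3-tuples."""
    return [(obj, name, value) for name, value in properties.items()]


def from_triples(triples: List[Triple]) -> Dict[str, Dict[str, str]]:
    """Regroup 3-tuples by object, rebuilding the property/value pairs."""
    grouped: Dict[str, Dict[str, str]] = {}
    for obj, name, value in triples:
        grouped.setdefault(obj, {})[name] = value
    return grouped


# Hypothetical sample data: an identifier and two made-up properties.
doc = "http://example.org/report"
props = {"author": "A. Writer", "language": "en"}

triples = to_triples(doc, props)
roundtrip = from_triples(triples)
```

If `roundtrip` equals `{doc: props}`, the data passes the test in the sense quoted: it can be expressed as a bunch of 3-tuples. The sketch assumes each property has a single value per object; repeated or ordered values would need more than this flat form.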

It is probably the only possible strategy: enrich the data and hope that somewhere along the line enough interesting information is marked up that might be useful. Of course semantic markup would also be useful for specific in-house AI systems and scholarly work.

Anyway, my gist is that semantic markup itself is not useful unless the "semantic universe" used for that markup is appropriate to your task. And even then, unless the markup is rigorous and applied to all the data consistently, it may not give the results it promises.

Artificial Intelligence boffins have been working for years and have come up with lots of nice side-benefits, but have not delivered on their direct objectives. AI people working on this (I used to work for TI supporting AI systems in their dying days, so I think I have seen the promise and the difficulties) have consistently failed to deliver: I think Apple had quite a long running project on this. The onus should be on them to provide complete solutions which have clearly addressed the technical problems (in this case, incompatibility of RDF with simple schema languages) rather than on our goodwill.

This is why I said RDF has a "superficially appealing model". To say that every relationship can be reduced to those tuples is quite a different thing to saying that direct representation of those tuples in markup is desirable. Behind RDF's syntax is the need not to make HTML break and the desire to be able to stream process data and stick in whatever markup at whatever point it is needed: this is why they need to create their own in-line schema declaration syntax rather than use headers or PIs. Why should XML put up with these sad constraints?

There is also a third problem with the use of RDF to create a global knowledge network.
That is that there are legitimate questions about data transmission speed and access: good AI searching [expletive deleted] up as much processing power as the user will bear: add on to this the transmission delays of networks and I cannot see the usefulness of such a network. Perhaps for in-house data. And I suppose you would get domain-specific search engines pre-fetching and pre-indexing data, like the HTML search engines do now. But it does mean that there needs to be an awful lot of infrastructure: Peter and Lesley's Virtual Hyperglossary is one big piece.

> I think the only thing in DTD's that are not trivially RDF-able are
> content models. They *are* RDF-able, but you have to use some of the
> "Seq" machinery, which I find awkward. In fact *every* attempt so far
> (the old DSD stuff, XML-Data, etc) to express content models in XML has
> come up verbose and unreadable compared to good ol' 8879 DTD notation.
> I think there's a better way, and want to see what xml-dev can come up
> with. -Tim

I think the trick may be to define many more special purpose schema languages rather than a single one. For example, a relational database schema language. Presumably there are proprietary and standard candidates. Vendors and users would undoubtedly appreciate it if they could continue to use their familiar schema notation and tools in XML without change. I would be far happier to allow existing schema languages than either to reinvent the standard declarations or attempt any grandiose universal schema systems.

If we took this view, then the best approach for XSchema might be to

1) find the major candidate schema languages (markup declarations being the first)
2) create specific XML versions of each of them, allowing for safe proprietary extensions (i.e., extensions that do not create any interchange problems with tools which do not use the extensions). And then
3) note how the schema can be analysed according to RDF's categories.

This is what Dan B.
has suggested: "It shouldn't be unfeasibly hard to represent XSchema ideas in terms of assertions framed as RDF triples..." (This representation can be done formally inside the DTD for XSchema too, using architectural forms. I think Elliot K suggested that ages ago.)

Rick Jelliffe

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)