Why not RDF for XSchema
OK, here we go: a crash course in RDF, or what I learned while participating in the RDF Schema WG. (DISCLAIMER IN ADVANCE: no secrets divulged here.) But first, an introduction. Simon said :-)

> I'm still conducting my own full-scale investigation of RDF. I'm very
> concerned about several issues:
>
> 1) I don't want this discussion limited to the relatively small number of
> people who have figured out RDF.

First of all, I need to say that I'm very much in favor of RDF and what it proposes to eventually accomplish. (And I know I'm gonna catch some flak for some of my opinions below, but I'm just trying to save the rest of you time. No one has the time to bone up on RDF real quick, not even over the weekend. I'm not trying to be critical so much as trying to explain my experience and understanding of it, and to be corrected and enlightened about whatever I may be confused about. Thanks in advance, Tim and Andrew.) So here we go:

> 2) I don't want XSchema to need to make drastic changes if RDF moves suddenly
> - it isn't complete. (From what little I've heard, this doesn't sound likely.)

I number my points from here...

1) Actually, it is highly likely...almost guaranteed, IMHO (you're gonna see me saying that a lot in this mailing...), until the syntax solidifies, and don't hold your breath for that to happen anytime soon. Really, I think it's a good thing to wait and let RDF evolve as we figure out what we need it for. It sure beats pumping something out in a timely fashion that ultimately is neither extensible nor useful. And organizing bookmarks (I don't care if they are the smartest bookmarks you ever did see) doesn't count in my book as an implementation of an extensible, semantically-enabled data content model, complete with meaning and understanding. An ideal, granted, but we must reach for the stars, yes?
And nothing against either WG. Others have made comments about how slowly RDF is progressing, but I doubt the people making these comments realize the huge task that the RDF syntax and schema groups have before them: they have been handed the ambitious task of integrating numerous existing semantic content models into a single unified conceptual syntax for expressing semantics in a way that is ultimately language- and implementation-independent.

Another confusing item, particularly when using RDF for writing schemas, is that the various existing KR communities use many of the SAME WORDS when they mean ENTIRELY different things, which is poetically ironic considering we are trying to define a model for conveying meaning in the first place. For starters, there are something like 8 or more meanings just for the word "schema". Then we've got "inheritance". And what about classes and types? Which word do we use? Even when we were just trying to pick one, the people who had a solid understanding of the properties involved would say "oh, type, class, what does it matter? We know what we are talking about." And for some people, that is certainly true. But life is just so much easier if you pick one so there is no ambiguity. Ambiguity will inevitably manifest; all we can do is systematically eliminate it as soon as it is recognized, yes? (Especially syntactic ambiguity, which can be immediately remedied :-) To go on to the next layer of semantic understanding without fully comprehending the meaning of the "scope" (right word?) of the concepts underneath is like getting back to the foundation of a house, or getting around to grounding your electricity, after you're all moved in.
(Insert metaphor for lacking a core-level structure here.)

Anyway, taking all of the above into consideration (even the stuff Tim is going to correct me about ;-), using RDF for anything other than an example of an example of an RDF Schema of an XSchema would be downright irrational. I'm not going to emphasize the syntactical instability except to say that it is unstable, which, like I said before, is a positive and correct path of action at this ever-so-formative stage of the semantic conceptual game.

2) In many ways, it is almost a backwards concept to consider using RDF to define XML Schemas. RDF is unique compared to other XML implementations because, for one thing, it is *not* an XML implementation per se, although it CAN be implemented in XML (and in my opinion should be, and will be, if it is ever going to be useful). So to talk about defining an XML-based application something-or-other using RDF doesn't mean a whole lot, because RDF itself is a conceptual model without a specifically defined syntax OR model at this point. (I realize that this flexibility was designed to be one of RDF's "features", but at this early stage of its core development it simply complicates just about every aspect of actually implementing it.) EBNF notation is what's used in the spec (syntax). UML has been attempted for the graphical representation of its data structures, but it got messy quickly. RDF Schemas were going to use XML-compliant syntax in the beginnings of the WG, but we soon fell into a trap attempting to "wing it" (translation: we made Guha and Andrew do all the work :-)

One could say that XML itself is unstable, considering the (admittedly few...but still...) parts of it that have yet to be defined, or are defined in an experimental manner (such as namespaces).
Nevertheless, it's the best we got: RDF syntax will be most useful when it is strictly defined in XML-compliant syntax, especially if we wish for our RDF and XML implementations to complement each other without restricting the expressiveness or interoperability of the other, on my own site or anyone else's. IMHO.

3) ...and this is a personal beef of mine. The current RDF WD spends nearly half, if not more, of its "ink" defining numerous ways for authors to abbreviate their RDF syntax. Not only are the varieties of abbreviation equally confusing (especially to those trying to learn RDF for the first time), but there doesn't seem to be anything to gain from them, especially if we agree that ultimately (in a perfectly structured, GUI-interfaced world :-) authoring tools will be generating the RDF after the functionality is determined. So we're not going to save any time or effort using abbreviated syntax...and we're going to screw up the interoperability of the data used in our RDF applications if everybody is abbreviating all over the place and those abbreviations are not clearly specified and accessible somewhere we can find them when it's time to let our data mingle, thus annihilating ANY benefit of abbreviating our RDF syntax. Not to mention the obvious counterproductive nature of abbreviating a syntax that IS NOT FINISHED BEING COMPLETELY DEFINED :-)

I understand how, at first glance, RDF seems to be SCREAMING to be abbreviated due to its often redundant and seemingly unnecessarily verbose syntax. At first glance (she said cautiously), RDF doesn't seem to be really doing anything with that verbosity.
Its syntax insists on being immediately complex, without providing a solid structure from which we can extend, and it doesn't always map very well. (Algorithmically is what I think I mean, so that someone CAN go from an RDF schema to a DTD. Not that they would want to, but should someone want to, doing so should be a syntactical exercise, and not a painful one. A systematic one!) Frankly, if my implementations aren't completely interoperable with the data of the rest of the free world, they are of no use to me: another potential casualty of using lazy or inconsistent abbreviations.

4) The existence of the above-mentioned syntax ambiguity makes it hard to construct the structure of its conceptual model. And since the nature of RDF's design is precisely to provide a means for constructing a conceptual data model from which meaning can be derived (if my understanding is correct), there quickly becomes a sort of chicken-and-egg thing, but using only a wing and a yolk (a wing and a prayer :-), especially for those (like myself, perhaps) currently unskilled in the field of semantics and knowledge representation. (It's kinda like a Rosetta stone, a puzzle, except that RDF has yet to exist completely in one piece, so one can burn a lot of time looking for pieces that one may NEVER find, and often you realize you didn't even need to find what you were looking for...)
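To make the abbreviation worry in point 3 a little more concrete, here is a minimal sketch, not a conforming RDF parser. It assumes the element and attribute names (rdf:Description, rdf:about) and the namespace URIs of the later RDF and Dublin Core specifications, which may not match the working draft discussed here. The `triples` helper and the example.org resource are hypothetical. It shows that a verbose form and an attribute-abbreviated form only interoperate if the reader knows both conventions.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URIs taken from the later RDF and Dublin Core specs.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Verbose form: the property is a child element.
VERBOSE = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">
  <rdf:Description rdf:about="http://example.org/doc">
    <dc:creator>Example Author</dc:creator>
  </rdf:Description>
</rdf:RDF>
"""

# Abbreviated form: the same property is written as an attribute.
ABBREVIATED = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">
  <rdf:Description rdf:about="http://example.org/doc" dc:creator="Example Author"/>
</rdf:RDF>
"""


def triples(xml_text):
    """Hypothetical toy reader: only top-level Descriptions with literal values."""
    root = ET.fromstring(xml_text)
    found = set()
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        # Abbreviated convention: non-RDF attributes are properties.
        for name, value in desc.attrib.items():
            if not name.startswith(f"{{{RDF_NS}}}"):
                found.add((subject, name, value))
        # Verbose convention: child elements are properties.
        for child in desc:
            found.add((subject, child.tag, (child.text or "").strip()))
    return found


# Both spellings reduce to the same statement only because this reader
# implements both conventions.
same = triples(VERBOSE) == triples(ABBREVIATED)
print(same)
```

A reader that implemented only the child-element branch would see no statements at all in the abbreviated document, which is the kind of silent interoperability failure point 3 is worried about.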
5) Datatypes are rather "funky" (maybe "tricky" or even "complex" is better) right now. RDF has its own set of RDF-centric primitive datatypes (why the set used for XML, as defined in XML Data, couldn't have simply been adopted outright, I'll never know). RDF Schema too, in theory, is scheduled to have ITS own RDF Schema-specific datatypes as well as a set of primitive datatypes, which were not provided, last I checked. This issue is another seemingly fundamental "core" requirement that the powers-that-be writing the spec somehow felt we could get back to later, while continuing to move forward despite the structural ambiguity.

I figured that as long as there is a mechanism for defining whatever datatypes you want, of whatever kind, using whatever structure to define them, and as long as they can be referenced and parsed from within an RDF application, what does it matter? (And I still feel this way.) But others see it differently: people who know more than me about structuring semantics and sets of conditional rules that will most likely be accessed via one or more inferencing engines, whose expressive syntax and semantic, conceptual structure WILL most likely be language-, application-, implementation-, and maybe even domain-specific, and potentially fussy and inflexible. Veterans tell me that these things can sometimes be handled more effectively using a set of built-in complex sorta-relational/conceptual datatypes. I still say give me a way to define these externally and we're in business. But really, when it comes to this stuff, to say that my experience with defining semantic conceptual complex datatype structures (defined externally or otherwise) is limited would be quite the understatement; calling it "experience" at all borders on an embellishment!
But I do understand that building "built-in" datatypes into an ambiguous conceptual model would seem to make the uses for those datatypes equally ambiguous (i.e., equally useless).

6) Another issue is that RDF and RDF Schema, at this point in time (unless things have changed recently), have distinct namespace prefixes, even when addressing identical conceptual properties that the two share. (I suspect that what I consider a "flaw" is perhaps required in some way, making this item merely a case of my schema naivety rearing its ugly head.) But at one point I researched the alternatives, and it seemed like separating the two would only make the already remote possibility of interoperability with other applications more remote, and that the potential for ambiguous, redundant resources could also become an issue somewhere down the road.

7) Another kvetch: RDF will sometimes use more than one colon in its namespaces, and I have seen inconsistencies across the examples provided by both specs, and between the examples and the text within the specs themselves, about when and under what conditions this lack of compliance might be required. Another peeve of mine, due to my hopeless dependence on the accuracy of such specifications.

8) RDF Schemas, at this time, do not contain a means of referencing metadata registries other than manually using URIs/namespaces (uuid won't help much with universal access). It seems to me that this would be one of those features that would be very worthwhile to take the time to "build in", and also a requirement for authoring schemas that can be useful.

> 3) I don't want readers of the XSchema spec to have to bounce between it, the
> RDF spec, and the XML spec to figure out what's going on. Rigorous definitely
> made it into the goals, but readable and clear remains an important target.

9) RE: the having-to-go-back-and-forth-between-specs issue: Boy! You really hit the nail on the head with that one.
In order to deal with RDF in any kind of cohesive fashion, which I figured I'd better be doing if I were to be of any use to the schema effort, I found myself immersed in the following topics/resources:

* Dublin Core (see digression...)
  <digression>(VERY helpful. Perhaps they have always had the right idea in terms of starting simple, agreeing on some meanings, and going from there step by step, however painful. At least they succeeded in clearly defining the semantic concepts, however limited/simplistic: when something is expressed, its meaning is clearly understood, referenceable, translatable, and, eventually....extensible.)</digression>
* Warwick Meeting of something or other
* A lot of great metadata theory from that big conference in 96
* AI background stuff
* Content Algebra automata theory classics
* RDF syntax
* RDF schema
* PICS Labels (according to the charter, RDF Schemas must be able to express PICS labels. What a pain that is. It currently does not do so.)
* MCF (the origin of many of RDF's ambiguities, in my opinion, although it had some very progressive ideas in its day)
* MCF Tutorial (immensely helpful)
* Aristotle's categories (not kidding -- thanks Andrew)
* XML Data
* XML 1.0
* Daniel Dardailler's neato transformation paper from last fall
* Niklaus Wirth's stepwise refinement paper
* MIME spec, URI spec, HTTP spec (IETF)
** Semantic query ontology stuff (it is NOT a tangent dammit!)
** Object-oriented programming books (not just Java, but C++ and component-based OO stuff in general: another example of one of the KR worlds requiring integration into RDF's conceptual model)
** Database theory and structure came into play somewhere in here (another "community")
** Distributed Computing Basics (mostly machine-level considerations, but I can see some incredible semantic possibilities utilizing DC. But not if I can't parse my bogusly-doubly-coloned resource identifier, or find my ambiguously-defined device identifier, or my OS-specific uuid:, or yadda yadda...God, I hope this all makes sense...or my dynamically-generated style sheet and/or my desired datatype reference, or SMIL streaming media source, or my architectural form, etc. etc. There are ways to always name things in what I am starting to define as an ANDROGYNOUS manner: not homogeneous, or even heterogeneous, but androgynous. Initially bland but ultimately dependable...)
** Pattern theory and stuff written by that architect guy whose name escapes me
** Organic-based information systems (like SGML...)

And by then I realized I had come full circle. And still didn't know what schema was :-)

10) There are accessibility issues with RDF if it is not expressed in XML (and maybe even then...) that I don't think either RDF WG has been able to begin to address. (See the WAI Accessibility spec.)

11) Let me try to say that another, perhaps more structured way :-) At this point, RDF is half-baked, OK? And we're not even sure if we have all of our ingredients yet, and all these different KR communities already have their own cookbooks that work just fine for them, but that don't seem to translate into more general kinds of syntactical expressions without compromising on the semantic meaning that can be derived from such expressions. In a consistent syntax it would make sense that this would happen, but when it happens in RDF, it does not do so in any kind of structured way.
It's harder to find holes in a wall that's already opaque, yes? Ultimately, this is a good thing, because defining RDF at the molecular level will ultimately enable the creation of complex derived semantic structures that will allow us to convey bona fide deep and rich and associative and contextual CONCEPTUAL MEANING at an (ideally gracefully-degradable...) machine-readable, "smart", seamless and automatic level. Which would make sense, except the problems that you run into do not occur in a uniform way. Not good for a language designed to be language-, application-, and implementation-independent, with the goal of enabling the interoperability of data between domain-specific semantic content models. And RDF really has some complex and intense semantic models that it has been asked to somehow integrate into a unified conceptual language capable of satisfying the needs of several KR communities (database programmers, Dublin Core, AI, etc...) without being too verbose (a goal that I fear is often disregarded ;-)

The long and short of it is this. When I was in the RDF Schema WG, and admittedly struggling with the conceptual model of a "schema" in general, I thought it was my own fault for not understanding schema. But once I stopped trying to learn schema in RDF and wrote schema using, say, XML Data, or even a DTD...which IS a type of schema...the concept of defining a data content model was not complex at all. Which has led me to believe that (DISCLAIMER: IMHO...drumroll please...) there's something casually, and unquestionably, ambiguous and inconsistent about RDF's structural model.

Why do so few understand RDF? Because at this point in the game, one can only conceive of the KINDS of things we WILL be able to do with it. Not much can be done now, until the syntax is finished and its conceptual (syntactical, not semantic, mind you) model is completely defined.
How can we even begin to construct a semantic model without a consistent and complete framework for the syntax? We can't. At one point (I don't think I am divulging any top secret WG info here ;-) Guha offered to translate any existing schemas into RDF for us, because none of the other WG members could figure it out either, and we're talking after months of trying. I think this is significant. How come almost nobody understood it? How come so many of you stated that you didn't understand it, when you are able to "grasp" so many other complex and abstract ideas? Even tried-and-true Dublin Core and database and UML guys (as opposed to "dummies" like me :-) weren't "getting it", excepting maybe Guha, Andrew, and Ora :-) Why is that? Several reasons, actually:

1) RDF (and particularly RDF Schemas, where you use RDF to actually attempt to "do something" besides simplistically describing the contents of your documents for search engines, a la the META tag or CDF) is supposed to provide a core-level semantic architecture that sits right on top of our core XML layer, enabling our domains' different semantic "knowledge representation" models to interact and "learn" from each other, yes? As it sounds now, one RDF implementation has the potential to not interoperate properly with another, even if both "live" in the same domain and were written using the same syntax.

Simon St.Laurent wrote:

> How will XSchema use/relate to RDF?
>
> There are several options I can see:
>
> 1) XSchema could be designed, top-to-bottom, as an RDF application.

This is a really BAD idea, for the reasons enumerated above.

> This wouldn't necessarily rule out defining XSchema in its own terms with an
> XSchema document or providing a DTD, but it would require that participants
> have significant working knowledge of RDF. On the bright side, I suspect the
> W3C would look more kindly on an RDF implementation, if we can make it work.

I know everyone is very protective of RDF...and they should be, since it is very much still gestating in the womb... (I, too, look forward to watching it grow up....:-)

> 2) XSchema could use RDF for the descriptive information contained in XSchema
> documents, but use its own model for defining elements and attributes.

Using any part of RDF's ambiguity to define even part of XSchema isn't really an option...not a useful one, anyway.

> 3) XSchema could ignore RDF altogether. This wouldn't rule out the use of RDF
> to provide metadata information about documents, but would leave RDF out of
> the structural information included in XSchema documents.

Gets my vote! RDF should be able to implement whatever we decide on for XSchema, in addition to its own RDF schema-specific functionality.

> 4) XSchema could allow the use of RDF as one of many ways to extend the schema
> information provided.

This is a given, given 3, yes? We don't want to restrict any form of extension, do we?

> Making RDF useful for this project, if we choose to use it, is going to
> require the creation of a much fuller set of tutorials in particular. The
> current set (listed below) is notable for its density and its lack of concrete
> examples. I'm willing to work on this project (I may even be able to get paid
> for some of it) if we decide that RDF is important. I'd also like to hear
> about other resources already in existence.

I've been working on an RDF Schema Tutorial that kinda got back-burnered when I left the group (webMethods had to bring in the big guns (Joe Lapp ;-), and to be honest the time expenditure was starving me out. Plus, I couldn't write about W3C-based issues with a clear conscience while being a participating WG member (I'm a by-the-book gal :-). I may make an exception to do some accessibility work....but I'm digressing again....

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)