Names and schemas
Through discussions on XML-DEV, I believe that I have clarified my own thinking on names, namespaces and schemas quite a bit. The fog is lifting. I believe that through careful agreement upon, and application of, definitions, we can get rid of most complaints about the namespaces proposal and remove all overlap between that proposal and things like architectural forms and XSchemas (under development).

Of course my definitions are context-based. Semioticians might disagree with them, but I think that they are sufficient for those of us in the markup language design business.

Definitions
===========

Name: (for our purposes) A Unicode string that refers to something. Example: FOO

Real object: (f.o.p.) A resource the computer can process. Example: a Java class or XML entity.

Conceptual object: (...) A resource the computer cannot process (yet). Example: the meaning of the English word "ship"

Namespace: A function that maps names to real or conceptual objects. Examples: The domain name system. A file system. A particular directory in a file system.

Declaration: An assertion that a real or conceptual object exists (in reality or as a concept!). Example: Element type declaration.

Definition: An assertion of what an object *is*. Definitions make things real to the computer. Examples: Java class definition. External entity definition.

Directory: (or dictionary, vocabulary) A document that declares objects and/or defines real objects. Examples: A DTD. /usr/dict

Schema: A document that defines a set of data objects (and thus implicitly defines a truth value: "is object X in the set?"). Note that schemata do not necessarily (and, in generic markup, will not usually) DEFINE objects... a schema defines a set. All it does for individual objects is report whether they are in the set or not. Alternatively, you could say that it constrains them.
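The definitions above can be made concrete with a small sketch. This is only an illustration under my own assumptions, not anything from the namespaces draft: the names dict_namespace, Directory, names_in_directory and max_depth are hypothetical, and a "document" is taken to be an element tree parsed with Python's standard xml.etree.ElementTree module. A namespace is modelled as a function from names to objects, a directory as a bare set of declared names, and a schema as a function from a document to a truth value.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Optional
import xml.etree.ElementTree as ET

# Namespace: a function that maps a name to an object (None if unbound).
Namespace = Callable[[str], Optional[object]]

# Schema: a function that reports whether a document is in the set.
Schema = Callable[[ET.Element], bool]


def dict_namespace(bindings: Dict[str, object]) -> Namespace:
    """Hypothetical helper: build a namespace from a table of bindings."""
    return bindings.get


@dataclass(frozen=True)
class Directory:
    """Hypothetical directory: it only declares that names exist."""
    declared: FrozenSet[str]

    def declares(self, name: str) -> bool:
        return name in self.declared


def names_in_directory(directory: Directory) -> Schema:
    """A schema that accepts documents using only declared element names."""
    def check(doc: ET.Element) -> bool:
        return all(directory.declares(el.tag) for el in doc.iter())
    return check


def max_depth(limit: int) -> Schema:
    """A schema that constrains nesting depth and ignores names."""
    def depth(el: ET.Element) -> int:
        return 1 + max((depth(child) for child in el), default=0)

    def check(doc: ET.Element) -> bool:
        return depth(doc) <= limit
    return check


if __name__ == "__main__":
    directory = Directory(frozenset({"abc", "def", "ghi", "jkl"}))
    doc = ET.fromstring("<abc><def/><ghi/></abc>")
    for schema in (names_in_directory(directory), max_depth(2)):
        print(schema(doc))
```

Note that the directory says nothing about how elements may be combined, and each schema only answers the membership question for a given document.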
Implications
============

When I teach or write about SGML DTDs, I always say: "The DTD declares what elements are allowed and what are the constraints on how they can be used." It is only today that I recognize that these are two different responsibilities. The first is the role of a *directory* and the second of a *schema*. There is nothing wrong with the same language performing both tasks. It can be very convenient. But we must be clear that they are two tasks.

Note that these definitions have the potential to sweep away the confusion about "multiple definitions" and "multiple inheritance" in the namespaces proposal. It only makes sense to *declare* each name once. Once it has been declared it has been declared. The software knows it exists. It only makes sense to *define* each name once for the same reason. A name can be bound to only one object.

But it makes perfect sense to have multiple schemas for a particular object. The same object could be constrained in a hundred different ways. Its content model could be checked by a DTD. RDF schemata could check that it fits into a reasonable logical meta-data framework. A linking schema could check that if it is a link, it is a "correct" one. And so on.

This is why you can attach multiple SGML architectures to a document (they are schemas) but you can only attach one DTD (it is both a directory AND a schema). This is also why the namespaces proposal and architectures need have no overlap. The one is about combining directories. The other is about constraining the named objects.

Here is an example of a directory (dictionary) that would not be a schema:

<!ELEMENT abc>
<!ELEMENT def>
<!ELEMENT ghi>
<!ELEMENT jkl>

Here is another:

abc
def
ghi
jkl

Of course, a directory *could be* a schema. As I said before, combining them can be quite convenient. But a schema could also exist which did not constrain names at all!
For instance, a Java class that mapped document instances to truth values would be a schema (albeit a hard-to-work-with schema!).

Note that declarations for objects will be the norm in XML applications. Definitions will be quite rare. Very few of the things that must be expressed in markup will be expressed in terms that the computer can understand. This is why XML does not have "element type definitions", but rather declarations. The definition, if it exists, is in the brain(s) of the author(s). An exception would be where an element is "defined by" a Java class or RDF schema.

Implications for the namespaces draft
=====================================

The namespaces proposal was always supposed to be about "naming things accurately" and not about competing with schema languages. Nevertheless, this separation of church and state is not complete. The namespaces proposal *does* promote the idea that an object should have a single schema. It should not.

Luckily, this is easily fixed. All we need to do is take all normative references to the word "schema" out of the spec. In some cases, they can be easily eliminated. For instance the SRCDEF could be eliminated entirely. The role of the namespaces proposal is simply not to point to schemas. If the SRCDEF is to be retained, then it should point to a *directory* (or dictionary) which is not necessarily a schema. (But I think that the FIRST URI should point to the directory.)

Here's another example of what must be changed:

"We envision applications of Extensible Markup Language [XML] where a document contains markup defined in multiple schemas, which may have been authored independently. One motivation for this is that writing good schemas is hard, so it is beneficial to re-use parts from existing, well-designed schemas. Another is the advantage of allowing search engines or other tools to operate over a range of documents that vary in many respects but use common names for common element types."
The two sentences should be reversed and modified slightly. Verifying that a document conforms to one or more schemas is simply a special case of "allowing tools to operate over a range of documents that vary in many respects but use common names." It need not (and probably *should not*) be privileged in the namespaces proposal. It is this type of language that makes people think that namespaces are a competitor to, or replacement for, architectural forms and other schema languages.

Similarly, Section 2.5 presumes that every element is constrained by a single schema. But we know that many will live in multiple schemas. Rather, it should refer to "directories" (or one of the synonyms). It makes sense for an element to be declared in, or defined in, a single directory.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

Three things it is far better that only you should know:
How much you're paid, the schedule pad, and what is just for show

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)