What's wrong with namespaces? Some observations and suggestions
Heyo.

So, I've been (recently, publicly) critical of the Namespaces in XML specification. My summary version: when I try to teach someone who doesn't care about XML about XML APIs, it's namespaces where they Boggle and Fall Down. Michael Kay offered some corroboration, in noting that over fifty percent of a particular day's queries were related to namespaces, and that this was typical for queries that he handled.

So. Herewith, some strongly opinionated judgments as to what's wrong with namespaces:

1) Namespaces have no need to use URIs.
2) Namespace URIs aren't URIs.
3) There is no standalone canonical form, representable in a non-XML context.
4) The distinction in semantics for the default (unnamed) prefix between elements and attributes is extremely difficult to explain, and the variation between default (unnamed) prefix and default (unnamed) namespace behavior promotes confusion.

First, that namespaces have no need to use URIs: they don't. The problem is one of authority, not of identifying resources. Notably, while there has long been a convention of using "locatable" or "retrievable" URI formats (especially http: scheme URIs), there is usually nothing at the specified location to retrieve. Note the invention of RDDL, for instance, and even its name (Resource Directory Description Language). RDDL exists to solve exactly this problem: there is no resource for this uniform identifier to identify.

The role played by URIs in XML namespaces is played by domain names in a number of languages that existed prior to the namespace specification (Java comes immediately to mind). Other solutions exist as well (Perl uses colons! but the distribution of authority is not outstanding). The problem: a distributed authority for partition of a global namespace, with no centralized administration to resolve conflicts.
The adopted solution, URIs, carries with it radical verbosity, a pre-existing BNF incompatible with the "Name" production in XML, and a potentially heavy implementation price should a language choose to use a full "URI object" to represent namespaces. Because the BNF for URIs is not a subset of the BNF for Name, it is impossible to specify a namespace without a surrounding context that binds namespaces to (Name BNF-compatible) prefixes (see item 3).

That leads to item 2: Namespaces in XML are not URIs. They cannot be compared like URIs; they follow none of the semantic rules of URIs. There is no such thing as a "relative" namespace as opposed to an "absolute" namespace (but users might reasonably expect it, especially after encountering XML Base). Comparison is by string equality. The empty string is valid (though a sentinel value). As already mentioned, even if the URI uses a well-known scheme with well-known handlers for retrieval of the uniformly-identified resource, there is no guarantee that a resource will be found there, or that if one is, it will be in a particular format, or for that matter will even be related to the namespace in question in any degree.

If URIs *had* to be chosen, then they should have been chosen "whole hog." Comparison should work like URIs; they should be subject to resolution (with all the horrendous weight that that would entail); they should be full URIs, not just strings in Name BNF-incompatible URI drag.

Now ... the fact that these URIs aren't URIs actually provides an opportunity. With no change to parsers or anything else, it is possible, even now, to recommend "package/path" style namespaces: instead of http://www.talsever.org/xml/namespaces/edml, org.talsever.xml.namespace.edml. A change incompatible with Namespaces in XML: org:talsever:xml:namespace:edml (but many existing processors would choke).
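A minimal sketch of the two points above, assuming Python's standard `xml.etree.ElementTree` as the parser (an illustrative choice; other processors may warn about or reject non-URI names). The upper-cased host variant and the helper `namespace_of` are hypothetical, made up for this example. The sketch shows that two namespace names which URI rules would treat as equivalent are distinct to the parser, and that a "package" style name is accepted as-is.

```python
import xml.etree.ElementTree as ET


def namespace_of(xml_text):
    """Return the namespace name of the root element ('' if it has none)."""
    tag = ET.fromstring(xml_text).tag
    # ElementTree reports namespaced tags as "{namespace-name}localname".
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


# Same URI under URI comparison rules (host names are case-insensitive) ...
lower = namespace_of('<entity xmlns="http://www.talsever.org/xml/namespaces/edml"/>')
upper = namespace_of('<entity xmlns="http://WWW.TALSEVER.ORG/xml/namespaces/edml"/>')

# ... and a "package" style name that is not an absolute URI at all.
dotted = namespace_of('<entity xmlns="org.talsever.xml.namespace.edml"/>')

print(lower == upper)  # False: the names are compared as plain strings
print(dotted)          # org.talsever.xml.namespace.edml
```

Nothing in this run resolves, normalizes, or dereferences the namespace name; it is carried through as an opaque string.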
W3C could reserve the short prefix "xml" (similar to Snoracle's reservation of "java"), and could additionally operate a "short-namespace registry." And this would potentially resolve item 3 ...

... which is that there is no standalone canonical form for fully-qualified XML names (namespace + localname) outside of XML. Because URIs, which are syntax-incompatible with XML Names, were chosen to facilitate namespace partitioning, we can't say "http://www.talsever.org/namespaces/edml:entity". James Clark proposed a useful form: "{http://www.talsever.org/namespaces/edml}entity" ... but it can't be used as the name of an element, and because it was not incorporated as a standard canonical form in the Namespaces in XML specification, it wasn't widely adopted. Instead, languages like XPath, where the construct would be useful, rely upon a context to externally define the namespace to prefix mappings (or prefix to namespace mappings, if you prefer). That XPath doesn't acknowledge binding of the default prefix is just added misery.

The lack of a standard canonical form is the reason for QNames in content, which violate the layering of XML, forcing awareness of namespace/prefix mappings out of the parser level into the application. Again, since namespace URIs aren't really URIs, adoption of a convention that substitutes "fully-qualified names" instead might work around this, but this is likely to be incompatible with existing namespace-aware parsers and processors (and because of the layering violation, that's pretty much every XML application out there, probably).

Fully qualified names have drawbacks, too (verbosity ... if you have to say "org.w3.namespaces.xhtml:p" instead of "p", you'll be *seriously* annoyed in short order), but further refinements might be possible, there--so long as the ability to always transform to a fully-qualified name existed.
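For illustration, one library that does use the "{namespace}localname" form is Python's standard `xml.etree.ElementTree`, which reports every namespaced element name that way. The sketch below (Python is an assumption here, not something the post relies on) shows that the expanded form is the same whichever prefix, or no prefix, the document happened to use.

```python
import xml.etree.ElementTree as ET

EDML = "http://www.talsever.org/namespaces/edml"

# Two documents that bind the same namespace differently.
with_prefix = ET.fromstring(f'<e:entity xmlns:e="{EDML}"/>')
with_default = ET.fromstring(f'<entity xmlns="{EDML}"/>')

# QName builds the expanded "{namespace}localname" string from its two parts.
expanded = ET.QName(EDML, "entity").text

print(expanded)
print(with_prefix.tag == with_default.tag == expanded)  # True
```

The prefix is gone after parsing; only the namespace name and the local name survive, which is what a context-free canonical form would need to carry.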
The final issue I offer is the significant confusion surrounding default namespaces and prefixes, and the difference in syntax and semantics in their application to elements and attributes. Again, I would suggest that this is in large part an artifact of the requirement to map the Name BNF-incompatible URI onto a compatible prefix, but the introduction of the empty string to represent the pre-namespaces global XML namespace complicates that analysis.

For elements, the default (unnamed, empty string, missing) namespace indicates the *global* namespace. In that context, it is very rarely safe to combine one vocabulary with another, unless both vocabularies have been vetted for compatibility in advance. The "stock" element of a recipe DTD has nothing to do with the "stock" element of an inventory DTD--a grocery might reasonably have both dialects in use.

For attributes, however, leaving them out of a namespace is the best thing to do. It's simpler, shorter to write, and there is no danger of a 'flag' attribute on a 'vessel' element becoming confused with the 'flag' attribute of a 'note' element. Attributes are implicitly "namespaced" by their container element.

Elements are not--but could be, certainly. There is no actual reason to apply a namespace to an element that is not going to be used at the "top level" (but "top level" is open to interpretation--see HTML microformats, for example, and this is particularly true where namespaces are most needed, for embedding something in another vocabulary). RELAX NG provides an explicit "export mechanism," elements that are allowed to be the starting point; that's a useful concept. Only such elements really need to be in namespaces--for HTML, one might say "block-level elements", for instance.

Compare the default (unnamed, missing, empty string and no colon either) prefix. It can be bound to a non-default namespace ... for elements.
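A small sketch of the element/attribute asymmetry, again assuming Python's standard `xml.etree.ElementTree`. The namespace name `urn:example:fleet` and the attribute values are hypothetical; the 'vessel', 'note', and 'flag' names come from the example above. Binding the default prefix moves the unprefixed elements into the namespace, while the unprefixed attributes stay outside any namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: the default prefix is bound to urn:example:fleet.
doc = ET.fromstring(
    '<vessel xmlns="urn:example:fleet" flag="Panama">'
    '<note flag="urgent"/>'
    '</vessel>'
)
note = doc[0]

# Unprefixed elements pick up the namespace bound to the default prefix ...
print(doc.tag)   # {urn:example:fleet}vessel
print(note.tag)  # {urn:example:fleet}note

# ... but unprefixed attributes do not; their names stay bare.
print(doc.attrib)   # {'flag': 'Panama'}
print(note.attrib)  # {'flag': 'urgent'}
```

The two 'flag' attributes are told apart only by the element each one sits on.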
Attributes with no prefix are not in the namespace bound to no-prefix; they're scoped by their containing element. Elements with no prefix may be in the global (that is, default) namespace, or in some other, non-default namespace. To find out, you have to have all the ancestors around. In some contexts, like XPath, an element name with no prefix *must* be in the global namespace; you can't bind the default prefix for an XPath expression--even if the prefix is bound for the original document pointed to by the expression and by the document containing the expression.

I spent half an hour trying (surprisingly patiently) to explain to an astronomer (very *bright*--but not involved in XML) friend that even if she bound the default prefix of her stylesheet to the namespace URI which was bound to the default prefix of her incoming document (the XHTML namespace), "/html/body/p" still wasn't going to match anything. She had to "redundantly" bind to "h" and use "/h:html/h:body/h:p". In the end, she didn't do it that way ... she preprocessed the incoming XHTML with a quick script that removed the namespace declaration, so she could write XPath that made sense to her ... and that was largely because my explanation devolved into chaos when she wanted to match on "@class" attributes (no, not @h:class ... well, because attributes don't work like that ... no, don't take out the "h" binding! ... wait, what are you doing? ... oh .... fine, just strip the namespace decl and do it that way, then).

Summarizing: choosing URIs for namespaces was a mistake (in my opinion), because it meant no canonical form, and required "binding"; that these URIs aren't really URIs is both a source of confusion, and a potential opportunity. Admittedly, it may not be possible to take advantage of the opportunity, in the current state of play.

My two cents (adjusted for inflation; I apologize for my well-known tendency to verbosity and sesquipedalian persiflage).

Amy!
--
Amelia A. Lewis amyzing {at} talsever.com
"Oh, [expletive deleted]! You did it just like I told you to!" (The manager's lament)
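For readers who want to reproduce the XHTML anecdote above, here is a minimal sketch. It assumes Python's standard `xml.etree.ElementTree`, whose `findall` supports only a limited subset of XPath, so it stands in for a full XPath or XSLT processor rather than replacing one. The sample document and its class value are made up for the example.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

# The incoming document binds the default prefix to the XHTML namespace.
doc = ET.fromstring(
    f'<html xmlns="{XHTML}"><body>'
    '<p class="intro">first</p><p>second</p>'
    '</body></html>'
)

# An unprefixed path names elements in no namespace, so nothing matches.
unprefixed = doc.findall("body/p")

# Binding the same namespace "redundantly" to the prefix h does match.
ns = {"h": XHTML}
prefixed = doc.findall("h:body/h:p", ns)

# The attribute test stays unprefixed: class is in no namespace.
by_class = doc.findall("h:body/h:p[@class='intro']", ns)

print(len(unprefixed), len(prefixed), len(by_class))  # 0 2 1
```

The paths are relative to the `html` root element that `fromstring` returns, which is why they start at `body` rather than `/html`.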