Re: URIs harmful (was RE: Article: Keeping pace with
Hmmm ...

On Fri, 2002-07-19 at 22:03, Uche Ogbuji wrote:
> Simon wrote:
> > Namespaces are probably the worst place where this pollyanna attitude
> > has smacked XML, but their progeny, QNames, offer their own set of
> > problems.
[snip]
> The harm of URIs is rather well contained when we apply to them the same
> attitude the loosely-coupled clique applies to XML itself. Let each person
> use them as he pleases and don't try any overarching design of URIs. The key
> is in loose coupling between signifier and signified, and between the agent
> granting the name and the agent using the name. Tight coupling between
> signifier and signified is one of my quarrels with Topic Maps. Tight coupling
> between the granter and the receiver of the name is one of the reasons I'd
> rather the W3C and others didn't address URI issues by fiat, even to squash
> 3000-message threads.

Err, yes and no, I think. Sending "URL" off to "URI-land," in which nothing can be known about what's inside the box, leads to unpleasant results not only due to the confusion over dereferencing, but due to the change in semantic.

A URI used as a namespace is officially a string, and you can only do string comparisons. Case is significant.

In most of the URL formats that follow "common internet format," case is variably significant. Since the hostname is defined via DNS, it isn't case sensitive; it inherits that from DNS. WWW.W3.ORG is *identical* to www.w3.org, because the limited subset of permissible characters in DNS defines it so; upper- and lower-case characters are identical.

Note that, for most schemes, the scheme is a specific identifier (case-sensitive), and the username and path portions, where they exist, are also case-sensitive. For SMTP addressing, username is officially case-sensitive (but is often resolved in case-insensitive fashion in the field ... but that's outside the spec, let's stay inside).
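To make the two comparisons above concrete, here is a rough sketch in Python. It is only an illustration, not anything the namespaces rec or the URI specs prescribe: the function names are made up for this example, and the "URL-style" comparison folds case in the hostname only, leaving user name, path, query and fragment alone. One caveat about the library: `urlsplit` lowercases the scheme on its own, so scheme case is not exercised here.

```python
from urllib.parse import urlsplit


def namespace_names_equal(a: str, b: str) -> bool:
    # Namespace names are compared as plain strings, character by character.
    return a == b


def urls_equal_host_case_insensitive(a: str, b: str) -> bool:
    # Hypothetical URL-style comparison (an assumption for illustration):
    # SplitResult.hostname is returned in lower case, so the hostname is
    # compared case-insensitively; user name, path, query and fragment
    # keep their case. urlsplit also lowercases the scheme by itself.
    def key(url: str):
        p = urlsplit(url)
        return (p.scheme, p.username, p.password, p.hostname, p.port,
                p.path, p.query, p.fragment)

    return key(a) == key(b)


upper = "http://WWW.W3.ORG/1999/xhtml"
lower = "http://www.w3.org/1999/xhtml"

print(namespace_names_equal(upper, lower))             # False: different strings
print(urls_equal_host_case_insensitive(upper, lower))  # True: same host per DNS
print(urls_equal_host_case_insensitive(
    lower, "http://www.w3.org/1999/XHTML"))            # False: path case differs
```

The same pair of strings is "different" as namespace names and "the same" as URLs, which is exactly the mismatch in semantic described above.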

There's further confusion, because, according to DNS, www.w3.org and www.w3c.org are the same thing. Or, on my local network (won't resolve for any of you, sorry), www.talsever.com == ftp.talsever.com == ns2.talsever.com == xfs2.talsever.com == log.talsever.com == talifane.talsever.com. Using any of these as part of a URL which will be subjected to the resolution algorithm will result in certain things coming up identical ... but string comparison won't.

Note that these are separate, but related problems. In the case of resolution, one might (as W3C seems to have done) rule out application of a normalizing algorithm to the URI, even though each one identifies its preferred algorithms as the initial element of the string representation. The widespread use of hostnames in the common internet format for URLs, and W3C's recommendation that these are the preferred form (because the publisher "owns" the namespace by virtue of owning the domain), makes the failure to recognize and handle the rules of normalization for DNS less than entirely compelling. More or less the same is true of encoding issues, whether they are url-encoded or quoted-printable encoded.

The namespaces rec specifically states that URI reference identity requires character-by-character identity, and it appears that there has been discussion within the TAG about the potential difficulties of doing anything more complex. There is clearly a great deal of complexity ... but it gets easier and easier to challenge the claim that "this is a URI reference" the further that the namespace string's semantic drifts from the semantic of a URI.

A namespace name, in fact, is a thing that has URI syntax. Only. It isn't a URI, or a URI reference, it is a namespace name, which is defined to have URI syntax. If I happen to have a URL object off over here that's intended for use (location of a resource, that is), it just isn't safe for me to compare its string form with a namespace name.
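The encoding half of the problem can be sketched the same way. This is a deliberately naive illustration in Python: the function names and the example.org addresses are invented for the example, and decoding the whole string with `unquote` is an assumption made for brevity (a real normalizer would have to leave reserved characters such as an encoded "/" alone).

```python
from urllib.parse import unquote


def equal_as_namespace_names(a: str, b: str) -> bool:
    # Character-by-character identity, as the namespaces rec requires.
    return a == b


def equal_after_percent_decoding(a: str, b: str) -> bool:
    # Naive URL-style comparison (illustrative assumption): decode every
    # %XX escape in both strings, then compare the results.
    return unquote(a) == unquote(b)


encoded = "http://example.org/%7Eamy/ns"   # hypothetical address
literal = "http://example.org/~amy/ns"     # same address, tilde unescaped

print(equal_as_namespace_names(encoded, literal))      # False
print(equal_after_percent_decoding(encoded, literal))  # True
```

A URL object that has been through any such decoding step no longer has a string form that can be trusted against a namespace name, even when both "point at" the same place.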

Which is to say, I don't think it's really an issue of coupling, but an issue of ambiguity, as Simon (and Len) originally suggested. Using a form (syntax) that carries extremely heavy connotations of an associated semantic, and violating that semantic (here I'm not speaking of the location algorithm, but of case-sensitivity, encoding, and resolution only, mind), is just guaranteed to produce confusion. Witness the 3000-message thread that Just Won't Die (and TBL reopened it with a suggestion that "relative URIs", an utterly *meaningless* concept when namespace names have been divorced from URI semantic (say "relative string" and "absolute string" and see what meaning you can discover), are not all that bad after all ... *sigh*).

Amy!
(also writing email at an hour when she should be snoozing ... if only it would *rain* and drop the temperature into the range of bearable ...)
--
Amelia A. Lewis amyzing@t... alicorn@m...
What's the end of a story? When you begin telling it.
-- Ursula K. Le Guin