Re: Re: URIs, concrete (was Re: Un-ask the question)
I'm going to be an irritating little git, Uche. Sorry.

On Sat, 2002-08-03 at 15:30, Uche Ogbuji wrote:
> [Amy wrote:]
> > Sorry, do we have any escaping rules? I don't recall seeing such a
> > thing in the Namespaces rec (I'm not considering the anyURI type in W3C
> > XML Schema; does that have escaping rules? Or interesting rules for
> > comparison? *sigh* Guess I'll go look ...).
>
> Yes we do. For example:
>
> http://bête.com
>
> Is an invalid URI, and thus an invalid namespace name. It must be escaped to
>
> http://b%eate.com
>
> One thing I don't know is how this URI restriction interacts with the recent
> opening up of DNS to i18n.

I can't actually find a justification for this. It isn't in the Namespaces recommendation, which is fairly silent on what a URI is. Instead, the recommendation points at RFC 2396.

Section 2 of RFC 2396 discusses representations of URIs, and the generalized escape mechanism. It is important to note, however, that the RFC delegates *all* authority over which characters are reserved for which components to the component ... that is, to the URI registration specification subsection dealing with that particular part of that particular URI scheme.

Or in other, other words, you may well have a requirement that a URI be legal and valid, per the scheme's constraints, before it is transformed into a namespace name. Once it has been so transformed, it is not possible to unescape it.

Since the escaping happens before a namespace name can be used, and there is no valid unescape mechanism, it does not make sense to speak of an escape mechanism. What you have, instead, is just a string of characters. This string should follow the rules to create a valid URI in some scheme, encoded for computer-based transmission, but it doesn't matter, because the namespace recommendation says you can't modify it, or interpret it, in any useful fashion.
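To make the "just a string of characters" point concrete, here is a minimal Python sketch. It assumes namespace names are compared character for character, with no unescaping; the helper `same_namespace` and the `http://example.org/` names are hypothetical, chosen only for illustration. It also shows that the escaped form of "ê" depends on which character encoding is assumed: the `%ea` in the quoted example matches ISO-8859-1, which is an inference here, not something the thread states.

```python
from urllib.parse import quote, unquote


def same_namespace(a: str, b: str) -> bool:
    """Hypothetical helper: treat namespace names as opaque strings
    and compare them character for character, never unescaping."""
    return a == b


# "b\u00eate" is "bête"; the escape is written out to keep this file ASCII.
host = "b\u00eate"

# The escaped form depends on the encoding assumed for the non-ASCII character.
latin1_form = quote(host, encoding="iso-8859-1")  # one octet for "ê"
utf8_form = quote(host, encoding="utf-8")         # two octets for "ê"

# A URI library will happily decode the escapes...
decoded = unquote("%61%6d%79")

# ...but as namespace names the escaped and unescaped spellings stay distinct.
ns_escaped = "http://example.org/%61%6d%79"  # hypothetical namespace name
ns_plain = "http://example.org/amy"          # hypothetical namespace name

if __name__ == "__main__":
    print(latin1_form, utf8_form, decoded)
    print(same_namespace(ns_escaped, ns_plain))
```

Under these assumptions the two spellings name different namespaces, even though a URI-aware tool would decode one into the other.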
Note that your example, above, is an invalid URI for computer transmission, but would be allowed, pretty explicitly, by RFC 2396. So blame the mess on TimBL, maybe.

But it seems fairly clear that there is no two-way activity happening. If you get something that contains %61%6d%79, you are *not* allowed to read it as 'amy'. The namespaces recommendation gives you no permission to unescape the encoded characters.

Amy!
-- 
Amelia A. Lewis    amyzing@t...    alicorn@m...
You like the taste of danger, it shines like sugar on your lips, and you
like to stand in the line of fire just to show you can shoot straight from
your hip. There must be a 1000 things you would die for; I can hardly
think of two.
                -- Emily Saliers