RE: Re: URIs, concrete (was Re: Un-ask the question)
> From: Amelia A Lewis [mailto:amyzing@t...]
> Sent: Saturday, August 03, 2002 11:00 PM
> To: Uche Ogbuji
> Cc: xml-dev@l...
> Subject: Re: Re: URIs, concrete (was Re: Un-ask the question)
>
> I'm going to be an irritating little git, Uche. Sorry.
>
> On Sat, 2002-08-03 at 15:30, Uche Ogbuji wrote:
> > [Amy wrote:]
> > > Sorry, do we have any escaping rules? I don't recall seeing such a
> > > thing in the Namespaces rec (I'm not considering the anyURI type in W3C
> > > XML Schema; does that have escaping rules? Or interesting rules for
> > > comparison? *sigh* Guess I'll go look ...).
> >
> > Yes we do. For example:
> >
> > http://bête.com
> >
> > Is an invalid URI, and thus an invalid namespace name. It must be escaped to
> >
> > http://b%eate.com
> >
> > One thing I don't know is how this URI restriction interacts with the recent
> > opening up of DNS to i18n.
>
> I can't actually find a justification for this. It isn't in the
> Namespaces recommendation, which is fairly silent on what a URI is.

And that's A Good Thing.

> Instead, the recommendation points at RFC 2396. Section 2 of RFC 2396
> discusses representations of URIs, and the generalized escape mechanism.

Yes.

> It is important to note, however, that the RFC delegates *all* authority
> over which characters are reserved for which components to the component
> ... that is, to the URI registration specification subsection dealing
> with that particular part of that particular URI scheme.

I disagree. Section 2:

<quote>
URI consist of a restricted set of characters, primarily chosen to aid
transcribability and usability both in computer systems and in
non-computer communications. Characters used conventionally as
delimiters around URI were excluded. The restricted set of characters
consists of digits, letters, and a few graphic symbols were chosen from
those common to most of the character encodings and input facilities
available to Internet users.

   uric = reserved | unreserved | escaped

Within a URI, characters are either used as delimiters, or to represent
strings of data (octets) within the delimited portions. Octets are
either represented directly by a character (using the US-ASCII character
for that octet [ASCII]) or by an escape encoding. This representation is
elaborated below.
</quote>

So a URI by definition consists only of US-ASCII characters, independently of the scheme.

> Or in other, other words, you may well have a requirement that URIs be
> legal and valid, per the scheme's constraints, before it is transformed
> into a namespace name. Once it has been so transformed, it is not
> possible to unescape it. Since the escape mechanism happens before a
> namespace name can be used, and there is no valid unescape mechanism,
> then it does not make sense to speak of an escape mechanism. What you
> have, instead, is just a string of characters. This string should
> follow the rules to create a valid URI in some scheme, encoded for
> computer-based transmission, but it doesn't matter, because the
> namespace recommendation says you can't modify it, or interpret it, in
> any useful fashion.
>
> Note that your example, above, is an invalid URI for computer
> transmission, but would be allowed, pretty explicitly, by RFC 2396. So

Nope. There's no distinction between a "URI" and a "URI for computer transmission". There is no such thing as an "unescaped" URI. After unescaping URI-reserved characters, it stops being a URI.

> blame the mess on TimBL, maybe. But it seems fairly clear that there is
> no two-way activity happening. If you get something that contains
> %61%6d%79, you are *not* allowed to read it as 'amy'. The namespaces
> recommendation gives you no permission to unescape the encoded
> characters.

Indeed.

Julian
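The two points above can be illustrated with a minimal Python sketch. It assumes the character classes of RFC 2396 section 2 and checks only the `uric` production; it does not validate scheme-specific syntax or the `#` fragment delimiter. The helper names `is_uric_only` and `same_namespace_name` and the `example.org` URIs are hypothetical, introduced here for illustration only.

```python
import re
from urllib.parse import unquote

# RFC 2396 section 2: uric = reserved | unreserved | escaped
#   reserved   = ; / ? : @ & = + $ ,
#   unreserved = ASCII letters and digits, plus the marks - _ . ! ~ * ' ( )
#   escaped    = "%" followed by two hex digits
_URIC = re.compile(r"(?:[;/?:@&=+$,A-Za-z0-9\-_.!~*'()]|%[0-9A-Fa-f]{2})*")


def is_uric_only(text: str) -> bool:
    """Return True if every character of text fits the uric production."""
    return _URIC.fullmatch(text) is not None


def same_namespace_name(a: str, b: str) -> bool:
    """Compare namespace names as literal strings, with no unescaping."""
    return a == b


# The raw non-ASCII form is rejected; the percent-escaped form is accepted.
print(is_uric_only("http://bête.com"))
print(is_uric_only("http://b%eate.com"))

# Unescaping produces a different string, so these are two distinct
# namespace names even though one unescapes to the other.
escaped = "http://example.org/%61%6d%79"
plain = "http://example.org/amy"
print(unquote(escaped) == plain)
print(same_namespace_name(escaped, plain))
```

The comparison deliberately does no normalization: a consumer that treated `%61%6d%79` and `amy` as the same namespace name would be applying an unescape step that the discussion above says it has no permission to apply.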