Re: Public Identifiers
At 03:05 PM 9/18/98 +0100, Michael Kay wrote:
>I think this response is referring to SGML rather than XML.
>There is no SGML declaration in XML. There is no normative
>link between XML and SGML, and therefore no normative link
>between XML Public Identifiers and SGML FPIs.

This is not true. From the published recommendation (italics mine):

  1. Introduction

  Extensible Markup Language, abbreviated XML, describes a class of data objects called XML documents and partially describes the behavior of computer programs which process them. *XML is an application profile or restricted form of SGML, the Standard Generalized Markup Language [ISO 8879]. By construction, XML documents are conforming SGML documents.*

>XML does not require a Public Identifier to be either public
>or an identifier; you can put anything in there that you
>like, and it has no defined meaning. Tim Bray's annotated
>XML spec (on www.xml.com) has this to say:

Here I agree. Public identifiers are, conceptually, the same as URNs, that is, they are names that are intended to be indirected to their actual system ID, rather than being direct references to storage locations, as URLs normally are. However, as Dan Connolly and Tim B-L have argued, there's no *functional* difference between a URN and a URL because persistence is always a function of the owner of the resource and cannot be guaranteed simply by the choice of name. Thus, at most, the URN/URL or public ID/system ID distinction can only express *intent*; it cannot guarantee results.

It is a fact of life that any storage addressing scheme (or, in fact, any addressing scheme at all) must include some notion of indirection. Both SGML and HTTP do this and *neither* defines the mechanism by which the indirection is implemented or managed. In SGML, there is a requirement that entity managers provide some mechanism for resolving public IDs to system IDs, but ISO 8879 does not define a mechanism.
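
To make the resolution step concrete, here is a minimal sketch of one mechanism an entity manager might provide: a plain lookup table from public ID to system ID. ISO 8879 does not prescribe this or any other mechanism, so everything here is an assumption for illustration: the `CATALOG` table, the public identifier, the file path, and the `resolve_entity` function are all hypothetical.

```python
# Hypothetical lookup table from public identifiers to system identifiers.
# The entry and its path are invented for illustration only.
CATALOG = {
    "-//Example Corp//DTD Memo//EN": "/usr/local/sgml/memo.dtd",
}


def resolve_entity(public_id=None, system_id=None, catalog=CATALOG):
    """Return the storage location to read for an external entity.

    A public ID found in the table wins; otherwise the system ID, if any,
    is used as given. This ordering is a design choice of the sketch,
    not something a standard requires.
    """
    if public_id is not None and public_id in catalog:
        return catalog[public_id]
    if system_id is not None:
        return system_id
    raise LookupError(f"cannot resolve public identifier {public_id!r}")
```

For example, `resolve_entity(public_id="-//Example Corp//DTD Memo//EN", system_id="memo.dtd")` returns the table's path, while an unknown public ID falls back to the system ID it was given. The only thing the public ID changes is which piece of software does the mapping.
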
Likewise, HTTP provides a mechanism by which a server can report that a URL has been redirected (the 300-series messages) but doesn't define the mechanism by which a server actually manages the redirection itself.

Thus, the unavoidable conclusion is that system IDs can be just as indirect, and just as persistent, as so-called "public" IDs. The only real difference is what bit of software gets the value of the ID to resolve.

There is a useful notion of "published" names, that is, names that the resource owner or name owner (they may not be the same entity) asserts will be persistent, but there is no standard or even convention for making that assertion. The original idea in SGML was that public IDs would be used for the names of "published" things, that is, resources that are available beyond the local scope of the resource owner. However, that original intent got lost in the more immediate need for general name indirection that public IDs provided (because SGML systems are required to provide some sort of mechanism).

My conclusion at this point is that the URN/public ID distinction is not helpful because it merely confuses the issue without actually solving any problems. The only thing public IDs did was force vendors to provide *a way* to do name indirection, which you do need on brain-dead operating systems that lack something like symbolic links (which includes both VM/CMS and DOS/Windows). If operating-system filename indirection were a universal service, you'd just use that to manage redirection of entity storage IDs. At the time SGML was developed, it certainly wasn't universal and it may not even have been known outside of Bell Labs (I don't remember precisely when Unix went public).

In hindsight, it's clear to me that we never should have allowed public IDs in XML. Oh well.

This is not to say that the URN idea is totally useless--it's very useful to have a syntax for saying what name space a particular name is unique within, which is really what URNs do.
However, I do have a problem with putting all of that information in a single string--it too severely limits your choice of syntaxes. I would much rather have some sort of name structure, such as:

  <urn:address id="local-id-for-remote-resource">
    <urn:name-domain>ISBN</urn:name-domain>
    <urn:name>ISBN 0-1233456-123-0</urn:name>
  </urn:address>

The "name-domain" element names the domain of names in which the name is unique (e.g., ISBN numbers in this example). The "name" element holds the name itself. By using element content rather than an attribute, there are no syntactic restrictions on the name (it could even have structuring subelements). You could also combine names together to form larger, multi-part addresses, if necessary.

Now I can refer to any resource in any name space regardless of the syntax the name space uses for its names. Of course, there is still a problem with naming the name spaces, but that can be solved either by providing a general "name space registration service" à la DNS or by simply defining in the relevant standards what the naming authorities are (as ISO 9070 does--9070 being the standard that defines the rules for SGML public identifiers). [Note that I don't use the term "naming authority"--the same name space may recognize several naming authorities, as is the case for SGML public IDs.]

Remember: there's no magic to URLs or URNs--they're just identifiers that some piece of software has to map to bytes at some point. The only real question is "is the pointer to the bytes also meaningful to humans or is it only for machines?" URLs are intended to be "opaque", meaning that there is no reliable intelligence in them. URNs are intended to be "meaningful" such that a human observer might have some clue as to what the resource is at the other end of it. This is a useful distinction but it doesn't require making the distinction at the point of reference (e.g., the PUBLIC/SYSTEM distinction SGML and XML make).
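
As a rough sketch of how software might consume the structured address shown earlier, the Python below parses the element and hands the name to whatever resolver is registered for its name domain. Several things are assumptions made only so the sketch runs: the namespace URI bound to the "urn:" prefix, the `RESOLVERS` table, and the `resolve_isbn` function, which stands in for a real lookup service and returns an invented "isbn-lookup:" string.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, added only so the "urn:" prefix parses.
URN_NS = "http://example.org/hypothetical/urn-address"

ADDRESS = f"""
<urn:address xmlns:urn="{URN_NS}" id="local-id-for-remote-resource">
  <urn:name-domain>ISBN</urn:name-domain>
  <urn:name>ISBN 0-1233456-123-0</urn:name>
</urn:address>
""".strip()


def resolve_isbn(name):
    # Hypothetical resolver: a real one would consult a catalog or service.
    return "isbn-lookup:" + name.replace("ISBN", "").strip()


# Hypothetical registry: one resolver per name domain.
RESOLVERS = {"ISBN": resolve_isbn}


def parse_address(xml_text):
    """Return (local id, name domain, name) from a urn:address element."""
    root = ET.fromstring(xml_text)
    domain = root.findtext(f"{{{URN_NS}}}name-domain").strip()
    name = root.findtext(f"{{{URN_NS}}}name").strip()
    return root.get("id"), domain, name


def resolve_address(xml_text, resolvers=RESOLVERS):
    """Dispatch on the declared name domain, not on the name's syntax."""
    _, domain, name = parse_address(xml_text)
    if domain not in resolvers:
        raise LookupError(f"no resolver registered for name domain {domain!r}")
    return resolvers[domain](name)
```

Calling `resolve_address(ADDRESS)` selects the resolver from the declared domain alone, so the name itself can use any syntax its name space likes.
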
It is sufficient to have the distinction be inherent in the form of address you're using, which means you need a way to declare what the form is, which is what my example above does.

Cheers,

E.
--
<Address HyTime=bibloc>
W. Eliot Kimber, Senior Consulting SGML Engineer
ISOGEN International Corp.
2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004
www.isogen.com
</Address>

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)