Re: When are <!DOCTYPE> and svg namespace references
On 3 Feb 2010, at 10:57, Ylvisaker, Steve wrote:
The only way to be certain would be to use some system utility which notices and reports attempts to open network ports.

The short answer is that none of the relevant specs unconditionally requires that such network resources be read, but none of them forbids it either. The longer answer has several parts.

(1) The presence of a DOCTYPE declaration does not, in principle, mean that the external DTD file must be dereferenced, though that is often the effect in practice.

The URI "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd" given as the system identifier for the DTD must be consulted by any processor performing DTD-based validation on the data. The presence of a DOCTYPE declaration does not constitute an instruction to validate the document, and in principle it would be good if processors like Firefox allowed you to specify whether you want validation performed or not. But in practice, many programs don't provide that kind of user control; instead they assume that if a DOCTYPE declaration is present, they must or should validate the document. For such programs, a request to read a particular document amounts in effect to an instruction to validate it as well, if a DOCTYPE declaration is present.

Note that a program validating the document may or may not actually hit the network: the authoritative source for the DTD is the server identified, but if your system has a caching proxy and the DTD is in the cache, there will not necessarily be any network traffic. And software built to work with documents of a particular kind may have and consult a locally cached copy of the DTD instead of retrieving it from the network.
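Short of a system-level monitor, point (1) can be probed for one particular parser from inside the process. The sketch below is only an approximation of such a utility: it assumes Python's standard `xml.etree.ElementTree` parser, and the helper `record_connections` is a hypothetical name invented for this illustration. It replaces `socket.socket.connect` for the duration of the parse, so it only sees connections opened through Python's own socket layer, and it says nothing about how Firefox, Inkscape, or any other program behaves.

```python
import socket
import xml.etree.ElementTree as ET
from contextlib import contextmanager

# A small SVG document carrying the DOCTYPE and namespace URIs under discussion.
SVG_DOC = """<?xml version="1.0"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
  "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:xlink="http://www.w3.org/1999/xlink"
     width="10" height="10">
  <desc>sample</desc>
</svg>
"""


@contextmanager
def record_connections():
    """Hypothetical helper: record and block outgoing socket connections."""
    attempts = []
    original = socket.socket.connect

    def guarded(self, address):
        # Note the attempted address, then refuse the connection.
        attempts.append(address)
        raise OSError("network access blocked for this experiment")

    socket.socket.connect = guarded
    try:
        yield attempts
    finally:
        socket.socket.connect = original


with record_connections() as attempts:
    root = ET.fromstring(SVG_DOC)

print(root.tag)   # the element name, qualified by its namespace name
print(attempts)   # addresses the parse tried to connect to, if any
```

With this parser the list of attempts should come back empty, since it does not validate and does not load the external DTD by default. A validating processor given the same document could behave differently.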
In the case of DTDs served from W3C servers, the DTDs change very infrequently and the expiration dates are set to encourage local caching; experience on those servers shows that surprising numbers of programs and packages are willing to request the same resource thousands of times in the same minute, whether the requests succeed or fail. When this happens frequently, it can place a bit of a strain on the server involved, so well-behaved software should arrange for some kind of local cache. See http://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic for a more complete account of some relevant issues.

(2) Many programs will fail gracefully (or relatively gracefully) if they can't get to the DTD.

Many programs which attempt validation whenever they see a DOCTYPE declaration will shrug their shoulders and proceed without validation if they don't succeed in retrieving the required external resources (such as the DTD). The logic of this behavior is not completely clear (if you think validation is required, why would you proceed anyway if you can't perform validation?), but it's not uncommon.

(3) Namespace names serve purposes of uniqueness and documentation. They will seldom need to be dereferenced.

The URIs "http://www.w3.org/2000/svg" and "http://www.w3.org/1999/xlink" in your sample graphic identify certain constructs in the XML as being in the SVG or the XLink namespaces, respectively. The crucial effect of this is to ensure that when the same local name is used in two different namespaces, markup can reliably be assigned to one or to the other. There is no need to dereference the namespace URI in order for software to perform that function.

Any software responsible for processing a particular vocabulary will need to know, given an element named (for example) "desc", whether it's the "desc" element it knows about (e.g. the SVG desc), or some other "desc" element (any desc in any other namespace).
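A minimal sketch of that check, assuming Python's standard `xml.etree.ElementTree`: the namespace name is treated as an opaque string and compared against a constant. The second namespace, `http://example.org/other`, and the helper `is_svg_desc` are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# The SVG namespace name, hard-coded as a plain string constant.
SVG_NS = "http://www.w3.org/2000/svg"

# Two elements with the same local name "desc" in different namespaces.
# The second namespace URI is a made-up example.
DOC = """<root xmlns:svg="http://www.w3.org/2000/svg"
      xmlns:other="http://example.org/other">
  <svg:desc>an SVG description</svg:desc>
  <other:desc>some other desc</other:desc>
</root>"""


def is_svg_desc(element):
    """True only for 'desc' in the SVG namespace; a pure string comparison."""
    return element.tag == f"{{{SVG_NS}}}desc"


root = ET.fromstring(DOC)
labels = [(child.text, is_svg_desc(child)) for child in root]
print(labels)
# [('an SVG description', True), ('some other desc', False)]
```

Nothing here retrieves either URI; the namespace name serves only as a key for telling the two "desc" elements apart.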
That recognition also does not require that the URI be dereferenced; software built to process SVG, for example, will almost certainly have the SVG URI hard-coded into it somewhere.

On the other hand, namespace documents are occasionally used to provide links (e.g. via a RDDL document) to relevant resources, such as schema documents in various schema languages. And so software may occasionally dereference a namespace URI to see if it can find relevant resources there.

And of course if a human is trying to understand what this SVG stuff is, then they might do worse than dereference the URI to see if it provides any useful human-readable information, or pointers to such information. (The SVG and XLink URIs do in fact do this.)

> Three of the applications, Firefox, Inkscape and Adobe CS3, care about the name of the xmlns URL. Something other than www.w3.org trips them up. Antenna House and Saxon don't seem to care.

I don't know why Antenna House behaves as it does.

> With the <!DOCTYPE> declaration I can reference www.w3.org as above, or reference an internal network URL, or drop the declaration altogether, and none of the applications performs differently. All of this is, of course, anecdotal data at best. It would be great to know for sure what is going on.

It sure would :)

> My question: Is there ever an attempt to make an external reference to www.w3.org from either the <!DOCTYPE> declaration or the xmlns reference?
If your organization is worried about things not working if the network goes down, I think your experiments show that that worry is not well founded. I think you would be best advised not to try to strip out the references to external resources.

Michael Sperberg-McQueen

--
****************************************************************
* C. M. Sperberg-McQueen, Black Mesa Technologies LLC
* http://www.blackmesatech.com
* http://cmsmcq.com/mib
* http://balisage.net
****************************************************************