Conformance in XML processors
I apologize in advance for being somewhat acerbic. I think that there are areas that the PR could be more clear, in particular what gets passed to the app, but what is required by way of DTD handling is pretty crystal clear, and blatantly incorrect statements being presented as facts at this point in history is dangerous and rather irritating. -Tim

At 11:10 AM 17/01/98, Peter Murray-Rust wrote:
>One design goal (4 in spec) is that it should be "easy to write programs
>which process XML documents". If that is interpreted that it is "easy to
>write software that processes *all* XML documents, throwing errors wherever
>one is required", then that goal is already lost. For example, James Clark
>has come up with about 140 carefully incorrect XML documents
... and both James' processor and Lark detect all 164 errors, modulo to-be-fixed ambiguities on weird boundary conditions. I will be astounded if, in the not-too-distant future, due to input from Microsoft and Netscape, every desktop doesn't come with a couple of fully conformant XML processors built-in. Yes, I agree that we didn't do as well on that design goal as I would have liked; but the empirical fact is that the software is already there.

>However, I think there will be domains where the full functionality (or at
>least the full syntax) of XML will not be used. In that case there will be
>"simple tools" that process XML documents. Not *all* XML documents, but a
>lot.
If there are widely-available fully-conformant processors which are already there in the browser and OS, why would you want to use a "simple tool" which will fail to accept conformant documents? Seems like a way to lose customers, to me.

> It seems to me reasonable that these tools can tell the user if they
>can't process a document.
It seems highly unreasonable to me; if I create a legal XML document in my nice Frame or Arbortext or SoftQuad software, and send it to you, and you say "oooh icky, that's too complicated for poor little me" you can expect vehement and sincere complaints.

>But I suspect there will be a number of tools which don't support the whole
>spec
I doubt it. Ooops, clarification, there will be tons of tools which don't validate. But when it is the case that both major browsers accept all conformant documents and turf non-WF docs, then there will be de facto a culture that will be intolerant of broken tools. Thank goodness.

> We have frequently talked about the Desperate Perl Hacker
>writing tools which are sufficient to process a class of XML documents, but
>not all.
Yes, but they don't claim to be XML processors. And that's just fine.

>A Document + DTD + request to validate document. Requires a validating parser.
Right.

>B Document + full DTD but no request to validate.
Right. We assume this document is WF, right?

>C Document + parts of a DTD (e.g. a few ELEMENTs and ATTLISTs, maybe an
>external subset which covers some of the ELEMENTs in the document).
If no request to validate, the fact of missing <!ELEMENT declarations is not required to have any effect, and applications must not depend on any behavior contingent on the processing of an <!ELEMENT or <!NOTATION declaration.

>D Document with no internal or external subset. Can only be well-formed.
Right.

>What the difference between A and B is is not clear to me.
Only the request to validate. Lots of WF docs will in fact be valid, but be called WF simply because some app has no need to validate.

>Note that Lark and AElfred both throw errors for
><!DOCTYPE FOO SYSTEM "bar.dtd">
>if bar.dtd cannot be found.
No. If you do lark.processEternalEntities(false) then it won't try to fetch the DTD. (Since "file:" URL's are in general a pool of blood on Microsoft operating systems, I recommend doing this most of the time).
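
The idea of telling a processor not to fetch the external DTD can be sketched outside Lark. What follows is only an illustration, assuming Python's standard-library SAX reader (expat-based and non-validating) as a stand-in; it says nothing about Lark's actual API, and the names `ElementNames` and `parse_without_external_entities` are hypothetical.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges


class ElementNames(ContentHandler):
    """Hypothetical handler that records element names as they start."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def parse_without_external_entities(text):
    """Parse text with external general entities (and the external DTD
    subset) switched off, so a missing bar.dtd is never looked for."""
    parser = xml.sax.make_parser()
    # Assumption: this feature is the closest standard-library analogue
    # of a "do not process external entities" switch.
    parser.setFeature(feature_external_ges, False)
    handler = ElementNames()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(text.encode("utf-8")))
    return handler.names


if __name__ == "__main__":
    doc = '<!DOCTYPE FOO SYSTEM "bar.dtd"><FOO/>'
    print(parse_without_external_entities(doc))
```

With the feature off, the document is still checked for well-formedness; only the retrieval of bar.dtd is skipped.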
>C is similar to B, but validation is not possible. It is *essential* that
>if ATTLISTs and ENTITYs (and NOTATION) exist, then the information in them
>MUST be applied to the document.
No. The spec is clear; a non-validating processor is required to do internal entities and default attribute values. Nobody should expect one to do anything with notations or unparsed entities or anything else. You want that, get a validating processor.

>*IFF* an ENTITY is declared (case C), the parser MUST process it.
If it's a non-validating processor, this is only true for *internal* entities. -Tim

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
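
The two things a non-validating processor is said above to owe the application, internal entities and default attribute values, can be seen in a small sketch. It assumes Python's standard-library ElementTree (built on the non-validating expat parser) as the processor; the document, the entity `author`, the attribute `status`, and the function names are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical case-C style document: an internal subset with one internal
# entity and one attribute default, and no <!ELEMENT declarations at all.
DOC = """<!DOCTYPE memo [
  <!ENTITY author "A. Writer">
  <!ATTLIST memo status CDATA "draft">
]>
<memo>From &author;</memo>"""


def read_memo(text):
    """Return (text content, status attribute) of the root element."""
    root = ET.fromstring(text)
    # The internal entity is expanded into the text, and the attribute
    # default from the ATTLIST is supplied although the tag omits it.
    return root.text, root.get("status")


def is_well_formed(text):
    """True if the parser accepts text, False if it rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(read_memo(DOC))
    print(is_well_formed("<memo><b></memo>"))
```

No validation happens here: the absence of element declarations goes unremarked, while a document that is not well-formed is rejected outright.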