Re: Does DTD validation work with namespaces?
I think I'm going to drop out of this thread again, as it's rehashing ground that has been VERY thoroughly covered in the past...

Namespaces are NAME spaces. That's all. Folks read all sorts of other implications into namespaces, and everyone has their own ideas about what they want to do with namespaces, but namespaces themselves (a) are just naming and (b) make no particular effort to be compatible with DTD validation. There was a fairly explicit assumption that if you were using Namespaces, you would either work with well-formed documents or with a namespace-aware schema language, _NOT_ with DTDs. If you insist on mixing the two, and it doesn't work well, the response is going to be "we know."

Namespaces don't do anything to make documents which mix separately defined tagsets easier to validate. All they do is improve your ability to tell which set a given tag was intended to belong to. You can use them to tell whether Gettysburg Address is a speech, a point in memory, or a place you can send mail to... but determining which of these makes sense at any given point in your document is not Namespaces' problem.

DTDs are not aware of namespaces. They look only at the QName, that is, the literal prefix:localname string as written in the document. Thus namespaces gain you nothing with respect to DTD-based validation, and in fact their declaration that the QName is not the "real" name of the element makes DTDs fairly explicitly the wrong tool for Namespaced documents. You can force-fit these, but in general it Really Isn't worth the effort. (I've done so as a stopgap, but I expect to discard those inadequate DTDs as rapidly as possible in favor of namespace-aware schemas.)

Remember that DTD/Schema validation is OPTIONAL, and was never intended to be a complete solution. If you really intend to arbitrarily intermix elements from multiple tagsets, the answer may be that none of these content modelling languages is adequate to capture all the logic describing what's permitted where and when.
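As a rough illustration of the earlier point that DTDs see only the QName, here is a minimal Python sketch using only the standard library. The namespace URI `urn:example:speech`, the prefixes `a` and `s`, and the helper names `qname` and `expanded_name` are all made up for this example. The two documents differ only in prefix: a QName-based view treats their root elements as different names, while a namespace-aware parser reports the same name for both.

```python
import xml.etree.ElementTree as ET

# Two documents that bind different prefixes to the same (hypothetical) namespace URI.
DOC_A = '<a:address xmlns:a="urn:example:speech"/>'
DOC_S = '<s:address xmlns:s="urn:example:speech"/>'


def qname(doc: str) -> str:
    """Return the root element's QName exactly as written in the markup.

    This is a deliberately naive scan that stands in for what a DTD sees;
    it assumes the document starts directly with the root element's tag.
    """
    start = doc.index("<") + 1
    end = start
    while doc[end] not in " \t\r\n/>":
        end += 1
    return doc[start:end]


def expanded_name(doc: str) -> str:
    """Return the namespace-aware name, which ElementTree spells '{uri}local'."""
    return ET.fromstring(doc).tag


qnames = (qname(DOC_A), qname(DOC_S))
expanded = (expanded_name(DOC_A), expanded_name(DOC_S))

print(qnames)    # the prefixes differ, so a QName comparison fails
print(expanded)  # the namespace-aware names are identical
```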
In that case the right answer is to stick with well-formed documents, and move the "validation" logic back into your application.

If the tagsets have a clear boundary between them -- in other words, if you aren't attempting to permit completely arbitrary interleaving of languages -- there's also the solution of using ANY in selected places and having your application logic explicitly check those to apply more intelligence at those specific points.

Other validation schemes are certainly possible; that's a large part of what the XML Schema effort is all about (and Relax, and others that have been proposed). In some of those, it _is_ possible to do more specific things than ANY while still allowing some kinds of language intermixing. But the more specific your validation constraints are, the less flexibility you have... and the more flexibility you have, the less meaningful validation is. That's probably unavoidable given that these are validating strictly against the syntax of the document, and don't understand its semantics. Again, if you need semantic awareness, that has to be done in your application.

Basically, the "publisher's gumbo" model of saying "I want to be able to drop MathML tags anywhere inside an SVG graphic embedded anywhere within an HTML document -- or vice versa" just doesn't fit the concept of Validation in the first place. If you really want the parser to provide some validation assistance, you need to give it a specific grammar for the document. If you can't describe that grammar precisely, validation can't do anything for you.

Yes, XML Schema leaves some things out, entities being the most-cited case. I don't know that I have an opinion on this, though I consider the current handling of entities a bit inconsistent. The workaround is to use DTD syntax to declare your entities, then run schema validation against the resulting Infoset. (Note that you _can_ validate against both DTD and schema, though I presume that will be rare.)
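A minimal sketch of that entity workaround, in Python with only the standard library. The internal DTD subset here declares an entity and nothing else, the parser expands it, and a namespace-aware check then runs on the parsed tree. The document, the `urn:example:memo` namespace, the entity `org`, and the function `is_valid_memo` are hypothetical. Python's standard library has no XML Schema validator, so the hand-written check is only a stand-in for the real schema validation step.

```python
import xml.etree.ElementTree as ET

# The DOCTYPE's internal subset declares one entity; it defines no content model.
DOC = """<!DOCTYPE memo [
  <!ENTITY org "Example Corp">
]>
<memo xmlns="urn:example:memo"><from>&org;</from></memo>"""

NS = "urn:example:memo"  # hypothetical namespace for this sketch

# The parser expands &org; while building the tree, so later steps
# only ever see the replacement text, never the entity reference.
root = ET.fromstring(DOC)
sender = root.find(f"{{{NS}}}from")


def is_valid_memo(elem: ET.Element) -> bool:
    """Stand-in for schema validation: a memo must contain exactly one
    non-empty <from> child in the memo namespace."""
    if elem.tag != f"{{{NS}}}memo":
        return False
    children = list(elem)
    return (
        len(children) == 1
        and children[0].tag == f"{{{NS}}}from"
        and bool((children[0].text or "").strip())
    )


print(sender.text)
print(is_valid_memo(root))
```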
Whether that workaround _should_ be necessary is something to take up with the Schema working group, though it's getting rather late to do so.

______________________________________
Joe Kesselman / IBM Research