Re: Come On, DTD, Come On! Thoughts on DSDL Part 9
/ John Cowan <jcowan@r...> was heard to say:
| I am assuming that the context for extending DTDs is not redefining
| XML, but rather creating an enhanced XML DTD format which can be used by an
| external validator.

Hmm. I imagined this work would redefine XML DTDs. If it's all for
post-parsing validation, why bother? We've got RNG, XSD, and Schematron
(and others). (By the same token, if the result won't be XML, why
bother, but ...)

| Example:
|
| <!NS foo SYSTEM "http://www.example.com/foo">

I've always imagined this as:

  <!NAMESPACE foo "http://www.example.com/foo">

There's no precedent for abbreviation in the decl name.

| Issue: Is it an error to mention a prefix that is not declared? My
| answer: no; if this is done, name matching falls back to string identity.

If the DTD contains an <!NAMESPACE decl, yes, otherwise no.

| Issue: is the keyword SYSTEM useful?

I don't think so. I'm not entirely sure I think it's a good idea to use
external identifiers for XML namespaces. (I also don't think there's a
precedent for SYSTEM w/o allowing PUBLIC and, much as I support public
identifiers, I'm not sure they work sensibly with XML namespaces (more's
the pity).)

| Issue: this does not help when prefixes are not used consistently
| throughout an instance. Do we care? My answer: no.

Sure it does. That's the whole point of the declaration!
I expect the parser to understand:

  <!DOCTYPE foo:test [
  <!NAMESPACE foo "http://www.example.com/foo">
  <!ELEMENT foo:test (foo:para+)>
  <!ELEMENT foo:para (#PCDATA)*>
  ]>
  <bar:test xmlns:bar="http://www.example.com/foo">
    <para xmlns="http://www.example.com/foo"/>
  </bar:test>

If it's going to be done post-XML parsing as you proposed, then move
that internal subset to an external file, and imagine you have two
documents:

example.xml:

  <bar:test xmlns:bar="http://www.example.com/foo">
    <para xmlns="http://www.example.com/foo"/>
  </bar:test>

example.edtd:

  <!NAMESPACE foo "http://www.example.com/foo">
  <!ELEMENT foo:test (foo:para+)>
  <!ELEMENT foo:para (#PCDATA)*>

I expect a validator to be able to test the validity of example.xml with
example.edtd. Or maybe I'm missing something.

| 2) Attribute data types. The names that can appear in an ATTLIST
| declaration directly after an attribute name are extended to include
| the datatype names of part 5 (i.e. XSD simple types).
|
| Example:
|
| <!ATTLIST baz
| foo integer #implied
| baz integer #required>
|
| Issue: do we need to make the datatype list extensible? If so, we could
| use QNames and a DATATYPE declaration, rather like the compact syntax
| of RELAX NG.

Mumble. I guess I'd go with qnames and let it be extensible. Off the top
of my head, anyway.

| 3) Element simple datatypes. Likewise, unparenthesized content models
| in ELEMENT declarations are extended from just ANY and EMPTY to include
| these same datatypes.
|
| Example: <!ELEMENT foo nonNegativeInteger>
|
| 4) Datatype lists. In either #2 or #3 context, a simple datatype name
| can be replaced by "LIST(name)" to indicate a whitespace-separated
| list of strings matching the datatype. IDREFS is equal to LIST(IDREF),
| and ENTITIES is equal to LIST(ENTITY).

There's fairly limited utility in extending DTDs. I think this is
starting to make it too expensive. I'm not sure I don't feel the same
way about 3 and if pressed, even 2.

| 5) Datatype choice.
| In either #2 or #3 context, a simple or LIST-wrapped
| datatype name can be replaced by |-separated names, to indicate a choice
| (derivation by union in WXS terms).
|
| Example: <!ELEMENT bar integer|name>
|
| Issue: what do we do about XSD facets? They are important but don't
| easily fit into the rigid DTD syntax.

Too much complexity.

| 6) Restore & connector. Bring back the & connector, either with the
| SGML semantics (A,B)|(B,A), or preferably with the RELAX NG "interleave"
| semantics. The difference is that, given the content model "A & B+",
| the element sequences A, B, B, B and B, B, B, A will match in either case,
| but B, A, B, B will only match using interleave semantics.
|
| Issue: SGML or interleave? My answer: interleave

My answer, don't do it.

| 7) Abandon SGML 1-ambiguity rules. Instead, allow complete flexibility of
| content models. See James Clark's discussion in "The Design of RELAX NG".

Nope.

| 8) Restore multiple element and attribute names separated by |s.
| This makes for conciseness and easy authoring. These constructs were
| dumped in XML DTDs because they imposed extra cost on validating parsers,
| but in this model validation is something done outside parsing, so higher
| cost is worthwhile.

Nah, I think this is a bit of syntactic sugar I can live without.

| 9) Fixed element content. Allow ELEMENT declarations to specify "#FIXED
| 'value'" after a datatype.
|
| Example: <!ELEMENT foo integer #FIXED "5">
|
| This means that the content of any foo element must be equivalent to 5
| according to the "integer" datatype's equivalence relation: therefore,
| 05, 005, +5, etc. will pass validation.

Nope.

| General issue: Should there be some way to indicate candidate roots?
| In existing DTDs, any element can be a root.

Nope.

                                        Be seeing you,
                                          norm

--
Norman.Walsh@S...      | One should always be a little
XML Standards Engineer | improbable.--Oscar Wilde
XML Technology Center  |
Sun Microsystems, Inc. |
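
For what it's worth, the matching behaviour expected of a validator in
the example.xml / example.edtd discussion above can be sketched roughly
as follows. This is only an illustration under stated assumptions, not
a proposal for how an enhanced-DTD processor would be built: nothing
here parses the <!NAMESPACE or <!ELEMENT syntax, and EDTD_NAMESPACES,
EDTD_CHILDREN, expand and root_and_children_match are hypothetical names
that stand in for the declarations by hand. The only point is that names
are compared as (namespace name, local name) pairs, so the instance's
"bar:" prefix and default namespace both match the DTD's "foo:" prefix.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for: <!NAMESPACE foo "http://www.example.com/foo">
EDTD_NAMESPACES = {"foo": "http://www.example.com/foo"}

# Hypothetical stand-in for: <!ELEMENT foo:test (foo:para+)>
EDTD_CHILDREN = {"foo:test": "foo:para"}

EXAMPLE_XML = """\
<bar:test xmlns:bar="http://www.example.com/foo">
  <para xmlns="http://www.example.com/foo"/>
</bar:test>
"""


def expand(qname, namespaces):
    """Turn a DTD-side name such as 'foo:test' into '{uri}test'.

    Unprefixed names are returned unchanged. What to do with an
    undeclared prefix is disputed in the thread, so this sketch does
    not decide it and simply lets the KeyError propagate.
    """
    prefix, sep, local = qname.partition(":")
    if not sep:
        return qname
    return "{%s}%s" % (namespaces[prefix], local)


def root_and_children_match(xml_text, root_qname):
    """Check the root name and a one-or-more child rule by expanded name.

    Only element names are checked; #PCDATA content is ignored here.
    """
    root = ET.fromstring(xml_text)
    expected_root = expand(root_qname, EDTD_NAMESPACES)
    expected_child = expand(EDTD_CHILDREN[root_qname], EDTD_NAMESPACES)
    children = list(root)
    return (
        root.tag == expected_root
        and len(children) >= 1
        and all(child.tag == expected_child for child in children)
    )


print(root_and_children_match(EXAMPLE_XML, "foo:test"))
```

ElementTree reports element names in the "{uri}local" form after
namespace processing, which is why the prefixes used in the instance
never enter the comparison.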