Re: Come On, DTD, Come On! Thoughts on DSDL Part 9


/ John Cowan <jcowan@r...> was heard to say:
| I am assuming that the context for extending DTDs is not redefining
| XML, but rather creating an enhanced XML DTD format which can be used by an
| external validator.

Hmm. I imagined this work would redefine XML DTDs. If it's all for
post-parsing validation, why bother? We've got RNG, XSD, and
Schematron (and others).

(By the same token, if the result won't be XML, why bother, but ...)

| Example:
|
| <!NS foo SYSTEM "http://www.example.com/foo">

I've always imagined this as:

 <!NAMESPACE foo "http://www.example.com/foo">

There's no precedent for abbreviation in the decl name.

| Issue: Is it an error to mention a prefix that is not declared?  My
| answer: no; if this is done, name matching falls back to string identity.

If the DTD contains an <!NAMESPACE decl, yes, otherwise no.

| Issue: is the keyword SYSTEM useful?

I don't think so. I'm not entirely sure I think it's a good idea to
use external identifiers for XML namespaces. (I also don't think
there's a precedent for SYSTEM w/o allowing PUBLIC and, much as I
support public identifiers, I'm not sure they work sensibly with XML
namespaces (more's the pity).)

| Issue: this does not help when prefixes are not used consistently
| throughout an instance.  Do we care?  My answer: no.

Sure it does. That's the whole point of the declaration!

I expect the parser to understand:

<!DOCTYPE foo:test [
<!NAMESPACE foo "http://www.example.com/foo">
<!ELEMENT foo:test (foo:para+)>
<!ELEMENT foo:para (#PCDATA)*>
]>
<bar:test xmlns:bar="http://www.example.com/foo">
<para xmlns="http://www.example.com/foo"/>
</bar:test>

If it's going to be done post-XML parsing as you proposed, then move
that internal subset to an external file, and imagine you have two
documents:

example.xml:
<bar:test xmlns:bar="http://www.example.com/foo">
<para xmlns="http://www.example.com/foo"/>
</bar:test>

example.edtd:
<!NAMESPACE foo "http://www.example.com/foo">
<!ELEMENT foo:test (foo:para+)>
<!ELEMENT foo:para (#PCDATA)*>

I expect a validator to be able to test the validity of example.xml
with example.edtd.

Or maybe I'm missing something.
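
To make that expectation concrete, here is a rough Python sketch of the
name matching such an external validator would need: resolve the DTD's
foo: prefix through the NAMESPACE declaration, then compare names by
namespace URI plus local name, so the prefixes used in the instance stop
mattering. The DTD_PREFIXES table and the expand() helper are
hypothetical, made up for illustration; nothing like them exists in any
current tool, and the extended DTD is not actually parsed here.

```python
import xml.etree.ElementTree as ET

# Hypothetical result of reading <!NAMESPACE foo "http://www.example.com/foo">
# from example.edtd (the declaration syntax itself is only a proposal).
DTD_PREFIXES = {"foo": "http://www.example.com/foo"}


def expand(qname, prefixes):
    """Turn a DTD name such as 'foo:test' into ElementTree's '{uri}local' form.

    Names with an undeclared prefix are returned unchanged; whether that
    case should instead be an error is the open issue discussed above.
    """
    prefix, sep, local = qname.partition(":")
    if sep and prefix in prefixes:
        return "{%s}%s" % (prefixes[prefix], local)
    return qname


EXAMPLE_XML = """<bar:test xmlns:bar="http://www.example.com/foo">
<para xmlns="http://www.example.com/foo"/>
</bar:test>"""

root = ET.fromstring(EXAMPLE_XML)

# <!ELEMENT foo:test (foo:para+)> : check the root name and its children.
declared_root = expand("foo:test", DTD_PREFIXES)
declared_child = expand("foo:para", DTD_PREFIXES)

root_matches = root.tag == declared_root
children_match = len(root) >= 1 and all(
    child.tag == declared_child for child in root
)
```

Both root_matches and children_match come out true for example.xml, even
though the instance uses bar: and a default namespace where the DTD says
foo:.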

| 2) Attribute data types.  The names that can appear in an ATTLIST
| declaration directly after an attribute name are extended to include
| the datatype names of part 5 (i.e. XSD simple types).
|
| Example:
|
| <!ATTLIST baz
| 	foo integer #implied
| 	baz integer #required>
|
| Issue: do we need to make the datatype list extensible?  If so, we could
| use QNames and a DATATYPE declaration, rather like the compact syntax
| of RELAX NG.

Mumble. I guess I'd go with qnames and let it be extensible. Off the
top of my head, anyway.

| 3) Element simple datatypes.  Likewise, unparenthesized content models
| in ELEMENT declarations are extended from just ANY and EMPTY to include
| these same datatypes.
|
| Example: <!ELEMENT foo nonNegativeInteger>
|
| 4) Datatype lists.  In either #2 or #3 context, a simple datatype name
| can be replaced by "LIST(name)" to indicate a whitespace-separated
| list of strings matching the datatype.  IDREFS is equal to LIST(IDREF),
| and ENTITIES is equal to LIST(ENTITY).

There's fairly limited utility in extending DTDs. I think this is
starting to make it too expensive. I'm not sure I don't feel the same
way about 3 and if pressed, even 2.

| 5) Datatype choice.  In either #2 or #3 context, a simple or LIST-wrapped
| datatype name can be replaced by |-separated names, to indicate a choice
| (derivation by union in WXS terms).
|
| Example: <!ELEMENT bar integer|name>
|
| Issue: what do we do about XSD facets?  They are important but don't
| easily fit into the rigid DTD syntax.

Too much complexity.

| 6) Restore & connector.  Bring back the & connector, either with the
| SGML semantics (A,B)|(B,A), or preferably with the RELAX NG "interleave"
| semantics.  The difference is that, given the content model "A & B+",
| the element sequences A, B, B, B and B, B, B, A will match in either case,
| but B, A, B, B will only match using interleave semantics.
|
| Issue: SGML or interleave?  My answer: interleave

My answer, don't do it.
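
For anyone weighing the two readings of "A & B+" quoted above, the
difference can be sketched in a few lines of Python. This is only an
illustration for this one content model, with children given as a list
of single-letter names; the function names are invented here, and this
is not how a real SGML or RELAX NG validator is implemented.

```python
import re


def sgml_and(children):
    """'A & B+' read as (A, B+) | (B+, A): A comes wholly first or wholly last."""
    return re.fullmatch(r"AB+|B+A", "".join(children)) is not None


def interleave(children):
    """'A & B+' read as interleave: exactly one A, at least one B, any order."""
    return (
        set(children) <= {"A", "B"}
        and children.count("A") == 1
        and children.count("B") >= 1
    )


both_1 = ["A", "B", "B", "B"]      # accepted under either reading
both_2 = ["B", "B", "B", "A"]      # accepted under either reading
only_interleave = ["B", "A", "B", "B"]  # A sits in the middle of the Bs
```

The first two sequences pass both checks; the third passes interleave()
but not sgml_and(), which is the distinction drawn in the quoted text.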

| 7) Abandon SGML 1-ambiguity rules.  Instead, allow complete flexibility of
| content models.  See James Clark's discussion in "The Design of RELAX NG".

Nope.

| 8) Restore multiple element and attribute names separated by |s.
| This makes for conciseness and easy authoring.  These constructs were
| dumped in XML DTDs because they imposed extra cost on validating parsers,
| but in this model validation is something done outside parsing, so higher
| cost is worthwhile.

Nah, I think this is a bit of syntactic sugar I can live without.

| 9) Fixed element content.  Allow ELEMENT declarations to specify "#FIXED
| 'value'" after a datatype.
|
| Example: <!ELEMENT foo integer #FIXED "5">
|
| This means that the content of any foo element must be equivalent to 5
| according to the "integer" datatype's equivalence relation: therefore,
| 05, 005, +5, etc. will pass validation.

Nope.
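
To show what the proposed check would cost an implementation, here is a
minimal Python sketch of comparing element content with a #FIXED value
in the value space of "integer" rather than as strings. The function
name is hypothetical, and the lexical pattern (optional sign, then
ASCII digits, with surrounding whitespace ignored) is my approximation
of the XSD integer type, not a complete datatype library.

```python
import re

# Approximate lexical form of an XSD integer: optional sign, then digits.
_INTEGER = re.compile(r"[+-]?[0-9]+")


def matches_fixed_integer(content, fixed):
    """Return True if content and fixed denote the same integer value."""
    content = content.strip()
    fixed = fixed.strip()
    if not (_INTEGER.fullmatch(content) and _INTEGER.fullmatch(fixed)):
        return False  # not a valid integer at all
    return int(content) == int(fixed)
```

With <!ELEMENT foo integer #FIXED "5">, content of 05, 005, or +5 would
pass this check, while 6 or 5.0 would not.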

| General issue:  Should there be some way to indicate candidate roots?
| In existing DTDs, any element can be a root.

Nope.

                                        Be seeing you,
                                          norm

-- 
Norman.Walsh@S...   | One should always be a little
XML Standards Engineer | improbable.--Oscar Wilde
XML Technology Center  | 
Sun Microsystems, Inc. | 
