XML-DEV Mailing List Archive

Re: DOCTYPE (was Re: Announcement: SAX 1998-01-12 Draft)

Peter Murray-Rust wrote:

> In conclusion (IMO) the DOCTYPE statement really only serves to
> identify the address of the external subset. It is equivalent to:
>
> <!DOCTYPE FOO [
> <!ENTITY % foo "foo.dtd">
> %foo;
> ]>

Exactly correct. The DOCTYPE declaration tells you *nothing* about the abstract type of the document (that is, the general class of documents of which the author intended it to be an instance).

> How do we determine the TYPE of a document? There is no good
> mechanism.

Not true. All that is necessary is to provide some way to point to a separate definition of the type. The SGML architecture mechanism, defined in ISO/IEC 10744:1997 and implemented in the SP parsers (as well as in purpose-built code), provides just such a mechanism. In December, James and I submitted for WG4 approval an enhancement to the formal mechanism that lets it be used with XML documents. See "http://www.ornl.gov/sgml/wg8/document/1957.htm".

The idea is a simple one: you use a PI to associate a local name with the "type" and then use a URL or public identifier to point to the documentation and the DTD that define the type. For example, ISOGEN has defined for its own use a base architecture from which a variety of specific document types can be derived. I can invoke the use of this architecture like so:

    <?XML 1.0 ?>
    <?IS10744:arch name="ISOBase"
      public-id="+//IDN isogen.com//NOTATION ISOGEN Base Architecture//EN"
      dtd-system-id="http://www.isogen.com/ISOBase/isobase.mdt" ?>
    <foo ISOBase="paragraph">Foo is now clearly a kind of ISOBase paragraph</foo>

By default, the architecture ("type") name is used as the name of the attribute that maps local elements to element types in the architecture (you can determine which types are available by looking at the architectural DTD).

Note that the presence or absence of a DOCTYPE declaration is irrelevant--all the information you need to interpret the Foo element as an ISOBase paragraph is in the instance.
The only thing a DOCTYPE declaration would add would be the convenience of setting a default value for the ISOBase attribute.

Note also that it requires no parser-level code to interpret and support the mapping because it's using normal XML syntax: PIs and attributes. It also doesn't require anything like the colonized names, because the name mapping is done through an attribute, which has the advantage that the same element can be mapped to different architectures at the same time. For example, I might want to also indicate that the Foo element corresponds to something in the RDF spec:

    <?XML 1.0 ?>
    <?IS10744:arch name="ISOBase"
      public-id="+//IDN isogen.com//NOTATION ISOGEN Base Architecture//EN"
      dtd-system-id="http://www.isogen.com/ISOBase/isobase.mdt" ?>
    <?IS10744:arch name="rdf"
      public-id="+//IDN w3c.org//NOTATION Resource Definition Format//EN"
      dtd-system-id="http://www.w3c.org/RDF/rdf.dtd" ?>
    <foo ISOBase="paragraph"
         rdf="some-rdf-element-type"
    >Foo is now clearly a kind of ISOBase paragraph</foo>

When you're doing ISOBase-related processing, you ignore the RDF mapping, and when you're doing RDF-related processing, you ignore the ISOBase mapping. Or you can consider both at once; it's up to your processor.

The document can be validated against either of the architectural DTDs by using a tool like SP, which has that facility built in, or by explicitly generating the document that reflects the mapping and then validating it against the architectural DTD. For example, the ISOBase "architectural instance" of the above is:

    <?XML 1.0?>
    <!DOCTYPE paragraph SYSTEM "http://www.isogen.com/ISOBase/isobase.mdt">
    <paragraph>Foo is now clearly a kind of ISOBase paragraph</paragraph>

That's all there is to it. The idea that DOCTYPE declarations tell you something useful is one of the top five Big Lies of SGML.

For more on the subject of architectures, see "http://www.isogen.com/papers/archintro.html", which goes into more detail about using architectures within an XML context.
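
As a rough illustration of the "explicitly generating the document" route, here is a minimal Python sketch. It is not SP's implementation and covers only the simplest case shown above: it reads the IS10744:arch PIs, then renames each element to the value of its mapping attribute. The function names (`parse_pseudo_attrs`, `architectural_instance`) are hypothetical. Two assumptions are baked in: the draft-era `<?XML 1.0 ?>` line is replaced by a modern `<?xml version="1.0"?>` declaration, since current parsers are expected to reject the old form, and elements that lack the mapping attribute are dropped while their text passes to the nearest mapped ancestor. The fuller ISO/IEC 10744 rules (attribute renaming, suppression, defaults from a DTD) are ignored.

```python
import re
import xml.parsers.expat
from xml.sax.saxutils import escape

ARCH_PI_TARGET = "IS10744:arch"  # PI target used in the examples above
# Pseudo-attributes inside the PI data: name="value" or name='value'.
PSEUDO_ATTR = re.compile(r"""([\w.-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def parse_pseudo_attrs(data):
    """Turn PI data such as 'name="ISOBase" ...' into a dict (hypothetical helper)."""
    return {
        m.group(1): m.group(2) if m.group(2) is not None else m.group(3)
        for m in PSEUDO_ATTR.finditer(data)
    }


def architectural_instance(source, arch_name):
    """Return the architectural instance of `source` for one architecture.

    Simplified sketch: only the element-name mapping is applied.
    """
    declared = {}            # architecture name -> pseudo-attributes of its PI
    out = []                 # pieces of the generated document body
    stack = []               # mapped name (or None) for each open element
    state = {"root": None}   # first mapped element becomes the DOCTYPE name

    def on_pi(target, data):
        if target == ARCH_PI_TARGET:
            attrs = parse_pseudo_attrs(data)
            if "name" in attrs:
                declared[attrs["name"]] = attrs

    def on_start(name, attrs):
        arch_type = attrs.get(arch_name)  # e.g. ISOBase="paragraph"
        stack.append(arch_type)
        if arch_type:
            if state["root"] is None:
                state["root"] = arch_type
            out.append(f"<{arch_type}>")

    def on_end(name):
        arch_type = stack.pop()
        if arch_type:
            out.append(f"</{arch_type}>")

    def on_text(text):
        # Keep text only when some enclosing element is mapped.
        if any(stack):
            out.append(escape(text))

    # No namespace separator: colons in PI targets are treated as plain
    # name characters.
    parser = xml.parsers.expat.ParserCreate()
    parser.ProcessingInstructionHandler = on_pi
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_text
    parser.Parse(source, True)

    if arch_name not in declared:
        raise ValueError(f"no {ARCH_PI_TARGET} PI declares {arch_name!r}")
    if state["root"] is None:
        raise ValueError(f"no element is mapped to {arch_name!r}")
    dtd = declared[arch_name].get("dtd-system-id", "")
    return f'<!DOCTYPE {state["root"]} SYSTEM "{dtd}">\n' + "".join(out)


SAMPLE = """<?xml version="1.0"?>
<?IS10744:arch name="ISOBase"
  public-id="+//IDN isogen.com//NOTATION ISOGEN Base Architecture//EN"
  dtd-system-id="http://www.isogen.com/ISOBase/isobase.mdt" ?>
<?IS10744:arch name="rdf"
  public-id="+//IDN w3c.org//NOTATION Resource Definition Format//EN"
  dtd-system-id="http://www.w3c.org/RDF/rdf.dtd" ?>
<foo ISOBase="paragraph" rdf="some-rdf-element-type">Foo is now clearly a kind of ISOBase paragraph</foo>
"""

if __name__ == "__main__":
    print(architectural_instance(SAMPLE, "ISOBase"))
    print(architectural_instance(SAMPLE, "rdf"))
```

Calling the function once per architecture name yields one derived document per architecture from the same source, each of which could then be handed to a validating parser together with the corresponding architectural DTD.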

If anyone would like to see real code that does architecture-based processing, I would be happy to provide it in any of the languages in which I've done it (Perl, Rexx, DSSSL, ACL, VisualBasic--sorry, no Java, only because I haven't had a need to do Java programming yet--note the preponderance of *interpreted* languages in this list :-).

Cheers,

Eliot

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)