Re: Type and Structure Re: ASN.1 and XML
From: "James Clark" <jjc@j...>

> I do not see either TREX or RELAX as trying to interpose a type
> system between markup structure and the model (I assume you
> mean a conceptual/semantic model).

Nevertheless they do.

> In fact, this is one of the big differences in philosophy I see
> between TREX and RELAX on the one hand, and W3C XML
> Schema on the other.

Yes, I think that is a very important point. Schematron can be added to the list along with TREX and RELAX, in that it also comes from the view that adding type information to the infoset for use in general processing adds a level of complexity (i.e. it requires schema processing rather than simple keying off the element's generic identifiers) that is utterly the wrong design for the web.

And it is no concession (as one of the few people in the world to have written a book on DTDs!) for me to say that I like grammars, use them, recommend them, think they are a smart and simple way to make assertions about simple structures, etc.

But my point is different. Grammars are quite handy for some structures, but they are dogs at others. Grammars with nice set-operation properties can be invented, but that won't make them any more usable than DTDs (though certainly more powerful) or XML Schemas. Having technology that can be grasped and used by the non-elite is, to me, a more worthy and difficult goal than science or elegance alone (not that the latter is remotely unworthy nor difficult, of course). The rest of the XML community seems to be on a pro-grammar crusade that perpetuates the disconnect between concept and expression. One thinks about things in some terms, then tries to fit those thoughts into a grammar.

For example, look at XML Schemas: it is written entirely around the idea of components, but there is no "component" element or type in sight. Its grammar is just concerned with expressing what it can: the markup gives little indication of how things are grouped.
A better syntax for XML Schemas 2.0 might be along these lines:

    <component name="HTML Table" markup="html:table" >
      <set>
        <rcdata name="caption" markup="html:caption" />
        <component name="HTML body" >
          <set name="HTML row" markup="html:tr" >
            <mixed name="HTML data" markup="html:td" />
          </set>
        </component>
      </set>
    </component>

Now of course this can be reduced to a grammar. (And of course one would have to introduce occurrence indicators, if one wanted to use it for some kinds of uses.) But the point is that some non-terminal symbols in a grammar have significance as more than just conveniences.

For example, let's look at HTML's meta element: there are two flavours (one with an http-equiv attribute, one without). We could have a schema language which allows both:

    <component name="HTML meta">
      <choice>
        <bag name="HTML meta with http-equiv" markup="html:meta[@http-equiv]" >
          <string name="HTTP equiv" markup="@http-equiv" />
          <string name="meta contents" markup="@content" />
        </bag>
        <bag name="HTML meta" markup="html:meta" >
          <string name="name" markup="@name" />
          <string name="meta contents" markup="@content" />
        </bag>
      </choice>
    </component>

In doing so, we are modeling the concepts, but in a way that is entirely amenable to conversion to validators (either to a grammar system or to something like Schematron). Some conceptual structures have markup representations; some do not. But the thing that determines them is not the mechanics of constructing a grammar: they correspond to logical structures that are not (and sometimes cannot be) marked up as the element tree. Grammars should be an implementation technique, not the nub of the question.

The XML Schema WG made several decisions, based on demarcation with RDF Schemas, that I think need not be followed by other schema-language developers: that bags and sets were not to be used, that links were not to be followed, and that we should not start with anything other than grammars.
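To make the "this can be reduced to a grammar" remark about the table example concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the function names (`marked_children`, `to_productions`) and the `#PCDATA` / `MIXED` labels are invented for this example, and occurrence indicators are ignored, as in the component sketch itself.

```python
import xml.etree.ElementTree as ET

# The table component description, reproduced from the post.
SCHEMA = """
<component name="HTML Table" markup="html:table">
  <set>
    <rcdata name="caption" markup="html:caption"/>
    <component name="HTML body">
      <set name="HTML row" markup="html:tr">
        <mixed name="HTML data" markup="html:td"/>
      </set>
    </component>
  </set>
</component>
"""

# Hypothetical labels for leaf kinds; not part of any real schema language.
LEAF_KINDS = {"rcdata": "#PCDATA", "string": "#PCDATA", "mixed": "MIXED"}


def marked_children(node):
    """Nearest descendants that have a markup representation.

    Groupings without a markup attribute are looked through, so their
    marked-up children are promoted to the enclosing element.
    """
    found = []
    for child in node:
        if child.get("markup"):
            found.append(child)
        else:
            found.extend(marked_children(child))
    return found


def to_productions(node, rules=None):
    """Collect one grammar rule per marked-up component."""
    rules = {} if rules is None else rules
    markup = node.get("markup")
    if markup:
        if node.tag in LEAF_KINDS:
            rules[markup] = LEAF_KINDS[node.tag]
        else:
            rules[markup] = [c.get("markup") for c in marked_children(node)]
    for child in node:
        to_productions(child, rules)
    return rules


rules = to_productions(ET.fromstring(SCHEMA))
for left, right in rules.items():
    print(left, "::=", right)
```

Note what falls out of the result: the "HTML body" component has no markup, so it leaves no trace in the derived rules. That is the sense in which the grammar is an implementation of the model rather than the model itself.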
I note the following too: if a schema language were made along the component lines sketched above, it would be possible to define mappings to schema languages. For example:

    <component name="Table" markup:html="table" markup:cals="tbl" >
      <set>
        <rcdata name="caption" markup:html="caption" markup:cals="title" />
        <component name="table body" markup:cals="tbody" >
          <set name="table row" markup:html="tr" markup:cals="row" >
            <mixed name="data" markup:html="td" markup:cals="cell" />
          </set>
        </component>
      </set>
    </component>

So we model the information (i.e. we get much closer to some ER system), then we use XPaths to describe how the information maps onto some XML representation. Whether this is then implemented using assertions or a grammar is not of interest to the schema user. And (at least for some significant structures) the problem of cross-mapping data (same information, different structures) between schemas goes away.

I hope that makes clearer the scope and nature of why I would lump RELAX and TREX in with XML Schemas in regard to types.

Cheers
Rick Jelliffe
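P.S. As a rough illustration of the cross-mapping idea for the HTML/CALS table model, here is a small Python sketch. The namespace URI `urn:example:markup` is a placeholder I had to invent so that the `markup:` prefix parses, and the helper names (`cross_map`, `concepts_without_markup`, `rename`) are hypothetical. It only renames elements; it does not insert wrappers such as `tbody` that exist in one representation but not the other.

```python
import xml.etree.ElementTree as ET

MARKUP_NS = "urn:example:markup"  # placeholder URI, assumed for this sketch

MODEL = """
<component xmlns:markup="urn:example:markup"
           name="Table" markup:html="table" markup:cals="tbl">
  <set>
    <rcdata name="caption" markup:html="caption" markup:cals="title"/>
    <component name="table body" markup:cals="tbody">
      <set name="table row" markup:html="tr" markup:cals="row">
        <mixed name="data" markup:html="td" markup:cals="cell"/>
      </set>
    </component>
  </set>
</component>
"""


def _key(vocab):
    """ElementTree spells a namespaced attribute as {uri}local."""
    return "{%s}%s" % (MARKUP_NS, vocab)


def cross_map(model, source, target):
    """Element-name table for concepts marked up in both vocabularies."""
    src, dst = _key(source), _key(target)
    return {
        node.attrib[src]: node.attrib[dst]
        for node in model.iter()
        if src in node.attrib and dst in node.attrib
    }


def concepts_without_markup(model, vocab):
    """Named concepts that have no element of their own in a vocabulary."""
    key = _key(vocab)
    return [
        node.get("name")
        for node in model.iter()
        if node.get("name") and key not in node.attrib
    ]


def rename(document, table):
    """Rename elements in place according to the cross-map."""
    for node in document.iter():
        node.tag = table.get(node.tag, node.tag)
    return document


model = ET.fromstring(MODEL)
html_to_cals = cross_map(model, "html", "cals")
unmarked_in_html = concepts_without_markup(model, "html")

html_doc = ET.fromstring("<table><caption>T</caption><tr><td>x</td></tr></table>")
cals_text = ET.tostring(rename(html_doc, html_to_cals), encoding="unicode")
print(html_to_cals)
print(unmarked_in_html)
print(cals_text)
```

The "table body" concept shows up as having no HTML element of its own, which is exactly the kind of logical structure that the model can name even when one representation does not mark it up.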