Re: Re: [namespaceDocument-8] 14 Theses
From: "Ronald Bourret" <rpbourret@r...>

> In thinking about using the machine readable parts of a RDDL document at
> run time, I think schemas are very useful if they can be used in a
> modular fashion.

I think this mention of "run time" is important, because there is no single "time of running" for dynamic documents being passed around systems. Or, better, a run time may be split into many phases, and at each separate phase some different specific constraints apply.

A schema language will tend to specify as required constraints all the constraints that are supposed (by foresight and designer's fiat) to be true at every phase, and to specify as optional constraints the constraints that may fail to hold at some phase. This is a fundamental flaw of schema languages (except for Schematron, which has a specific phases mechanism).

So schema languages will typically have a workaround mechanism: they will have two public identifiers, a persistent one to let you know the genus and a specific one to let you know which schema to use in the particular phase (this is of course the SGML Formal Public Identifier versus System Identifier split, which we can see in the schemaLocation attribute of XML Schemas too, for all intents and purposes).

RDDL's flaw, too, is that it does not provide any built-in mechanism for supporting phases AFAIK. One can indeed have multiple RDDL files for the different phases, and name them by putting them at different locations. But as with XML Schemas, DTDs, Examplotron, and RELAX NG (corrections to this welcome!), there is no way to manage the different variants, or even to say that one is a variant of another. For publishing, the CATALOG format has been developed to allow all the different path remappings at a particular phase to be bundled together, but still there is no idea of phases.
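
To make the idea of phases concrete, here is a minimal Python sketch, not Schematron itself and not any real schema processor. It only shows the shape of the mechanism: constraints are written once, and each named phase activates a subset of them. The element names (`article`, `metadata`, `title`, `table`, `row`), the phase names, and the two check functions are all hypothetical, invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical constraints. Each takes the root element and returns a
# list of error strings (empty when the constraint holds).
def tables_have_rows(doc):
    return [
        "table without rows"
        for table in doc.iter("table")
        if table.find("row") is None
    ]

def metadata_has_title(doc):
    meta = doc.find("metadata")
    if meta is None or meta.find("title") is None:
        return ["metadata/title is missing"]
    return []

# Each phase names the constraints that are active in it. A constraint
# shared by several phases is written once and listed wherever it applies.
PHASES = {
    "tables-only": [tables_have_rows],
    "final": [tables_have_rows, metadata_has_title],
}

def validate(doc, phase):
    """Run only the constraints that are active in the given phase."""
    errors = []
    for check in PHASES[phase]:
        errors.extend(check(doc))
    return errors

# A draft whose tables are done but whose metadata is still empty.
DRAFT = ET.fromstring("<article><metadata/><table><row/></table></article>")
```

With this arrangement the same draft passes in the `tables-only` phase and fails in the `final` phase, without anyone maintaining two separate schemas or rewriting the document's declaration between steps.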
For publishing, where there is often a division of labour in the markup team, the lack of phases has made specialization more difficult: the table queen cannot say "just validate the tables, don't give me validation errors about the metadata; we know we have not completed that yet!"

If we consider schemas as a software engineering technique, as specialized languages for black-box testing of a pipeline of processes, then without some phases mechanism, it may be impractical to validate at each incremental step. We can make up a different DTD for each step, but then we need to rewrite the DOCTYPE declaration, and if we change a content model in some way that is invariant throughout the pipeline, we will need to change each particular DTD. XML Schemas has more targeted mechanisms for extension, restriction, and importing than DTD's parameter entities, which provide a single mechanism that covers a zillion cases, so these should make life a little easier for deriving individual schemas for different stages of a pipeline. But still it is clunky because the constraints are not gathered together and named by their phase.

I think a lot of the discussion about whether namespaces are enough to process documents overlooks the fact that processes can augment documents with new infoset items as well as passively swallow the infoset. It also overlooks that there may be house rules about which elements are required or optional: I know of a banking sector case where every institution uses the same namespace and elements in an application but each bank requires a different selection of elements, so you have to validate each document against each bank's schemas to know if it contains the right information items.

When there are augmenting processes, any schema (or schema umbrella) that does not support phases can only capture the document as a system of variants and invariants that hold for the total pipeline or some particular point or range of the pipeline.
Typically this will be the form deemed suitable for public exchange.

So, back to RDDL, a document type or namespace may need a PUBLIC RDDL, declaring the end-to-end variants and invariants or the invariants at a particular point for optimal public interchange, but there also need to be phase-specific, system-specific RDDLs.

The TAG group, when thinking about namespaces, may find it useful to be very clear about when their statements apply to end-to-end or public uses of namespaces and when they apply to system-specific or phase-dependent uses of namespaces.

Cheers
Rick Jelliffe

P.S. In general, let us guess that in about 60% of cases, a namespace+name is enough to know how to process an element. In 20% of cases we will need to know the parent. And in 20% of cases, we need to know the value of an attribute too. In maybe 1% of cases (lists) we need to know whether it is the first element or not. The latter two cases are often hidden in procedural code, so people can easily think (e.g. Tim BL's comments) that only namespace+name+parent is enough to process an element. (Indeed, this is reinforced by XML Schemas' lack of support for attributes.)
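
The four cases in the P.S. can be illustrated with a small Python sketch using the standard library's `xml.etree.ElementTree`. Everything specific here is an assumption made up for the example: the namespace URI, the element names (`chapter`, `title`, `para`, `note`, `list`, `item`), the `type` attribute, and the handler labels. The point is only to show where each kind of context enters the dispatch.

```python
import xml.etree.ElementTree as ET

NS = "http://example.org/ns"  # hypothetical namespace

def q(name):
    """Build ElementTree's '{namespace}local-name' form of a tag."""
    return "{%s}%s" % (NS, name)

def choose_handler(elem, parent, is_first):
    """Pick a handler label using as little context as each element needs."""
    tag = elem.tag
    if tag == q("para"):
        # Case 1: namespace+name alone is enough.
        return "para"
    if tag == q("title"):
        # Case 2: the parent decides how the element is treated.
        if parent is not None and parent.tag == q("chapter"):
            return "chapter-title"
        return "section-title"
    if tag == q("note"):
        # Case 3: an attribute value decides.
        return "warning" if elem.get("type") == "warning" else "note"
    if tag == q("item"):
        # Case 4: position among siblings decides.
        return "first-item" if is_first else "item"
    return "unknown"

def walk(root):
    """Visit every element in document order and record the handler chosen."""
    results = []

    def visit(elem, parent, is_first):
        results.append(choose_handler(elem, parent, is_first))
        for index, child in enumerate(elem):
            visit(child, elem, index == 0)

    visit(root, None, True)
    return results

SAMPLE = ET.fromstring(
    '<chapter xmlns="http://example.org/ns">'
    '<title/><para/><note type="warning"/>'
    "<list><item/><item/></list>"
    "</chapter>"
)
```

Note that the attribute and position tests sit inside ordinary `if` statements, which is exactly how such cases end up hidden in procedural code rather than visible in a name-based dispatch table.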