Fraser wrote:
> In many circumstances both XSD itself
> and validation against XSD is just too brittle.

For what it is worth, here is my two cents concerning schemas for long-lived or open or multi-party or public data:

* Namespace URIs should never identify versions of schemas, but a general semantic area and controlling authority. They should be enough to determine which class of application or plugin is appropriate. THEREFORE software applications need to use more than the namespace to identify the schema and software. This is analogous to CODECs for media applications: the user sees JPEG or MPEG etc., but that just invokes a system that sniffs the data and selects (or downloads) the appropriate CODEC.

* All schemas and documents should have separate version numbers, which feature minor and major numbers for the schema: if the old schema can be derived by restriction from the new schema, it is a minor version, otherwise it is a major version. THEREFORE all documents that were valid against a schema with a lesser minor number will be valid against the schema with a new minor number (given the same major number).

* Documents should use the lowest major and minor number that corresponds to the features actually used in the document. THEREFORE no document will be unnecessarily rejected where the receiver is an older system.

* Where the application space has competing standards, or the documents are compound, or where the standards are not stabilized, or where there are plain and fancy alternatives, or where there is churn and evolution, the IS29500 Part 3 Markup Compatibility and Extension (MCE) mechanism should be adopted. This allows alternative sections, marked up by namespace URL, with a must-understand mechanism; for any public material, at least one choice in a plain form should be used (e.g. as well as supporting SVG, support JPEG alternatives.) THEREFORE the receiver can choose the optimal client.

* Where there are multiple versions of a standard, there should be a Schematron schema made which can report unambiguously which version was used. In other words, the evolution of the schema should be schematified. THEREFORE a client can base its decision to process on relevant changes in the schema, i.e. on elements actually found, not irrelevant ones: the schema may have changed in ways that are irrelevant to the application.

* Use Schematron patterns to abstract out and model commonality between major numbers and the variations for minor numbers.

* Use Schematron phases to abstract out and model the evolution of the schema through major and minor numbers. Terminal applications in a processing graph should select the phases or patterns to validate incoming data against based only on their needs, and ignore irrelevant constraints.

Part of the problem of versioning is that the standard schema languages were made with little practical thought about versioning. SGML DTDs had a workable marked section system that provided a measure of support for modeling variants in the same schema document, but not as first-class objects. XSD has its notions of type derivation, but they are fragile and weak (derivation by extension!) and interact poorly with other parts of the language (UPA, etc). DSDL has a story (DSRL for token changes, NVDL for modularity, XProc for smarts like versioning) but not a reality yet (XProc is only just out of the oven.)

Only Schematron has first-class objects (patterns, phases) with enough power to model schema evolution. And even in Schematron the schema probably needs to be written with the intent that the schema can be further evolved.

The problem is not schema languages: it is the unprofessionalism of schema professionals, if I may be uncomfortably frank.
We make schemas without serious thought to maintenance; and because of this we choose schema languages which do not have any serious support for evolution (and, even if we do use Schematron, we don't organize them so that systems using them will be written to cope with change.)

Let me be more challenging: if you (the schema developer) adopt a schema language which has no real support for evolution, then of course you will eventually have trouble: you have dug the hole yourself. Now perhaps I might be accused of blaming the victim, but if you choose to adopt an inadequate schema language, which has had this inadequacy publicized for years, you cannot consider yourself a victim of the inadequacies of XSD etc. They are what they are, and you don't have to adopt them!

Cheers
Rick Jelliffe
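
As a rough illustration of the version-number rules in the list above, here is a minimal Python sketch. It is not part of any standard or tool: the element names, the INTRODUCED_IN table and both function names are hypothetical, and it assumes a single major version in which every minor version only adds features (so the old schema is a restriction of the new one).

```python
from typing import Iterable, NamedTuple


class Version(NamedTuple):
    major: int
    minor: int


# Hypothetical table: the first schema version in which each element appeared.
# Assumption: all entries share one major number and minors only add elements.
INTRODUCED_IN = {
    "title": Version(1, 0),
    "para": Version(1, 0),
    "footnote": Version(1, 1),
    "video": Version(1, 2),
}


def lowest_version(elements_used: Iterable[str]) -> Version:
    """Lowest version that covers every element actually used in a document."""
    return max(
        (INTRODUCED_IN[name] for name in elements_used),
        default=Version(1, 0),  # an empty document needs only the base version
    )


def receiver_accepts(document: Version, receiver: Version) -> bool:
    """Same major number, and the document's minor is not newer than the receiver's."""
    return document.major == receiver.major and document.minor <= receiver.minor


if __name__ == "__main__":
    doc = lowest_version({"title", "footnote"})
    print(doc)                                    # version the document should declare
    print(receiver_accepts(doc, Version(1, 1)))   # older receiver, features within reach
    print(receiver_accepts(Version(1, 2), Version(1, 1)))  # document too new for it
```

A document that uses only "title" and "footnote" declares 1.1 rather than the newest version, so a receiver built against 1.1 still accepts it; the decision rests on the elements actually found, not on the latest schema revision.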