Schematron versus Godzilla (was Re: xml schema)
From: "Dare Obasanjo" <dareo@m...>

> Interestingly enough the X.12 Reference Model for XML Design also
> prohibits using xs:appinfo elements which means that they practises like
> embedding Schematron in W3C XML Schema is against their rules. I find it
> rather ironic that you are touting the document as a guideline of how to
> use W3C XML Schema.

Err, it would be more ironic if I ever had the view that Schematron (or any other schema language) was the ONE TRUE WAY :-)

There are many good reasons why people might decide not to use Schematron in certain circumstances:

* it does not work streaming: it may require some kind of tree and multiple passes
* it does not integrate currently with WXS' type system (though that may come)
* it is not supported by many commercial tools (though it has more open source support than any other schema language)
* it is unconcerned with storage issues or defaulting
* people who don't know XPath but are expecting a DDL equivalent find XPaths complicated and scary, and the ability to model many kinds of "business rules" may be regarded as a weakness rather than a strength (i.e. they may feel that Schematron is not really a "schema" language at all!)
* (the big one!) people are so caught up/freaked out by XML Schemas that they fear that the additional stage of Schematron constraints only adds to the conceptual complexity.

And, of course, there are many good reasons to use it:

* simpler
* more powerful
* clearer diagnostics
* validates different kinds of things from grammar-based schema languages
* easier to use (for people who do have XPath expertise)
* easier to implement (in fact, trivial to implement)
* works with most mainstream XSLT tools (such as MSXML 4)
* less abstraction and more use of general-purpose constructs
* you only need to express the constraints that your other schema languages (DTD, RELAX, WXS, etc.) cannot express, allowing a "best of both worlds" approach where there is a standard universal schema (in some domain) and "localized" versions to suit the individual use cases; this simplifies management
* since there is no reason that an arbitrary data model forms a natural tree (though there is often some nice tree structure that can be implied), there is little reason to expect that a tree-based schema language (e.g. regular grammars) can fully or nicely represent all the relations
* Schematron allows very idiomatic constraints to be expressed, rather than imposing an idiom (contrast with XML Schemas, which does not allow attributes to constrain content models)
* the phases mechanism supports progressive validation
* people may indeed want to validate some "business rules" at the same time as they validate static structural rules.

Personally, I much prefer the DSDL approach of a simple framework language invoking different little languages, rather than the XML Schema-ish approach of <appinfo>. The trouble with <appinfo> is that it assumes that other schema languages are based on types or annotate the conceptual structures in XML Schemas: there is no equivalent in XML Schemas to a Schematron "pattern". So an embedded Schematron schema (using <appinfo> annotations on element declarations) may be useful and have practical advantages, but it is only a fairly limited use of Schematron.
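
To make the point about attributes constraining content models concrete, here is a minimal sketch in Python. It is not Schematron itself: it only imitates the rule/assert idea (a context path, a test, a diagnostic) using the limited XPath subset in the standard library's xml.etree.ElementTree. The element and attribute names (order, payment, method, cardNumber, iban) are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Each entry imitates a Schematron rule: a context path selecting elements,
# a test that must hold for every selected element, and a diagnostic message.
# All element and attribute names here are hypothetical.
RULES = [
    (".//payment[@method='card']",
     lambda e: e.find("cardNumber") is not None and e.find("iban") is None,
     "a card payment must contain cardNumber and no iban"),
    (".//payment[@method='transfer']",
     lambda e: e.find("iban") is not None and e.find("cardNumber") is None,
     "a transfer payment must contain iban and no cardNumber"),
]


def validate(xml_text, rules=RULES):
    """Return the diagnostics of every failed assertion, in rule order."""
    root = ET.fromstring(xml_text)
    failures = []
    for context, test, message in rules:
        # findall() searches below the root, so payment must be a descendant.
        for element in root.findall(context):
            if not test(element):
                failures.append(message)
    return failures


if __name__ == "__main__":
    doc = ("<order><payment method='transfer'>"
           "<cardNumber>4111</cardNumber></payment></order>")
    for diagnostic in validate(doc):
        print(diagnostic)
```

The grammar for payment can stay loose (cardNumber or iban allowed), while the attribute-dependent part lives in a couple of path-based assertions layered on top.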

In the long run, I expect XML Schemas will move to a more modular framework, align its grammars with RELAX NG, factor out things that are niche requirements (e.g. nil) into modules, factor out syntactic sugar, and allow constraints expressed using fundamentally different paradigms such as Schematron. My understanding is that the XML Schema WG rather sees the value of XML Schemas as lying in the components and type lattice, far more than in the particulars of syntax or organization.

I cannot see why in the long run everyone cannot win: the X.12 profile shows that, at least for a significant bunch of users, XML Schemas goes too far, and that layering XML Schemas to start with a RELAX NG-sized core is enough for 80/20. Let's take that as the core, then use a modular framework so that the people who need types can have them, the people who need nils can have them, the people who need co-occurrence constraints can have them, and the people who need patterns can have them.

So it is no bad reflection on Schematron if one organization feels that it is not suitable for them, any more than it is a bad reflection on HTML if I write something using DOCBOOK. X.12 want to define the minimal WXS approach; it seems very consistent to me that if they don't recommend user-defined datatypes or substitution groups, they also would not recommend annotations. I see it as a matter of layering: they are more or less saying "the bottom line is this conservative subset of WXS", which is pretty much a RELAX NG subset too. The more conservative the XML Schema is, the more Schematron will emerge as being appropriate as a nice layer for additional constraints, supplementing (and simplifying) the WXS schemas.

In the longer term, I think Schematron fits into process-centered Software Engineering practices such as "quality circles".
There needs to be a process in place whereby downstream constraints can be fed back to upstream validators, or where upstream detectors can feed forward to downstream implementers. Maybe best practice will emerge that software tools need to be in place (from the outset!) to allow agility and rapid response to feedback: this kind of constraint validation is very different from the things that X.12 is concerned about.

In other words, it is good to go ahead with conservative WXS controlled by corporate or industry standards, but also put in place a Schematron validator to allow agile local constraints such as:

* "we don't accept this element"
* "this envelope must be addressed to us"
* "invoices of that amount are not supposed to come here"
* "don't want to allow content of more than a certain size because we have a buffer overrun problem in our software, doh"
* "these jokers are sending bad data; we need to check it more"
* "these constraints are suitable for people sending data who are setting up their systems, and need to get accurate diagnostics before going live"

Cheers
Rick Jelliffe
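
As a rough illustration of the local constraints listed above, a few of them might be sketched like this in Python. Again, this is not a Schematron implementation, only an imitation of its rules and phases using the standard library's xml.etree.ElementTree. The message layout (message, envelope/to, invoice, total, attachment), the recipient id "ACME", the limit of 10000 and the phase names are all assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

OUR_ID = "ACME"               # hypothetical identifier for "us"
MAX_INVOICE_TOTAL = 10000.0   # hypothetical local limit

# (phase, context path, test, diagnostic); element names are hypothetical.
# The "setup" phase holds extra checks for senders still configuring systems.
LOCAL_RULES = [
    ("live", ".",
     lambda e: e.findtext("envelope/to") == OUR_ID,
     "this envelope must be addressed to us"),
    ("live", ".",
     lambda e: e.find(".//attachment") is None,
     "we don't accept the attachment element"),
    ("live", ".//invoice",
     # Assumes total, when present, holds a plain decimal number.
     lambda e: float(e.findtext("total", "0")) <= MAX_INVOICE_TOTAL,
     "invoices of that amount are not supposed to come here"),
    ("setup", ".//invoice",
     lambda e: e.get("currency") is not None,
     "invoice is missing its currency attribute"),
]


def check(xml_text, phases=("live",)):
    """Run only the rules belonging to the active phases."""
    root = ET.fromstring(xml_text)
    report = []
    for phase, context, test, message in LOCAL_RULES:
        if phase not in phases:
            continue
        for element in root.findall(context):
            if not test(element):
                report.append(message)
    return report


if __name__ == "__main__":
    doc = ("<message><envelope><to>OTHER</to></envelope>"
           "<invoice><total>25000</total></invoice></message>")
    for diagnostic in check(doc, phases=("live", "setup")):
        print(diagnostic)
```

Because the rules are just data, a local constraint can be added or dropped without touching the shared industry schema, which is the kind of agility described above.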