Re: RNG vs. XSD : is the use of abstract types and polymorphism
I like what Norm wrote. I'd add a couple of things.

XML came out of a desire to reduce the bloat of SGML; I think the instigators of RELAX NG harboured the same desire for RELAX NG: use a powerful model with a few constructs, add some nice sugar (RNC), and re-use external standards (XSD datatypes). This is essentially a pragmatic view, along the "wrong is right" lines. But the essential use of the schema language was for validation.

The instigators of XSD had a different set of desires: a superset of DTDs, a superset of inheritance, a superset of datatypes, plus support for database keys, conversion to and from object systems, the ability to model families of schemas like XHTML, and some kinds of internal model checking (abstract, complex type inheritance). The intent was a schema language for all seasons. The essential use of the schema language was type assignment, as in data binding: documents built by systems generated from an XSD schema would necessarily be valid.

A big set of requirements leads to a big technology, which leads to a big chance that any particular adoption of the technology beyond simple validation would (have to) leave bits out, making it a gamble whether your system could support a given schema. (XSD 1.1 improves the bang per buck over XSD 1.0, but increases the risk of not being supported.)

When I look at them, I can understand both approaches, from the angle that the more complex a system is, the more that extra layers of abstraction are required. RELAX NG has five kinds of modularity (patterns, external ref, include, define, combine) while XSD has a few more. So you would expect that, as your system gets more complex, RELAX NG will run out of steam for modeling power sooner than XSD does. However, as your system gets even more complex, XSD's modeling runs out of steam too. For example, XSD brings little to the picture when trying to reconcile dialects: it has no ability to say "this attribute in schema A is the same as this element in schema B".
So in order to cope with these you either need to abandon RELAX NG and XSD or build extra layers on top. (The same goes for Schematron too.) For example, the XBRL effort built modeling on top of XSD. And the result? XSD starts with so much complexity that anything built on top that provides a superset of XSD's capabilities is liable to be bloated and require a lot of buck per bang. Again, XBRL is an example. Contrast that with RELAX NG: starting from the smaller, neater base allows neater layering of modeling features above it.

This is not an abstract issue. In my job I have to look at schema issues where modeling is necessary, but neither RELAX NG nor XSD provides any assistance. The difference is that RELAX NG was designed in the expectation of semi-custom extra layers, while XSD was designed in the expectation of integration into a vendor's toolchain: once you buy into the toolchain you are stuck/blessed with whatever modeling capabilities the vendor provides.

Next, two minor quibbles.

First, I'd note that things like XSD's abstract type feature can be (easily?) fitted on top of RELAX NG, using a Schematron schema: you just check that a particular set of pattern names (LHS) are not used directly in an element declaration (RHS). (And remember that XSD's complex type derivation is used for model checking as a layer, not as a method of reducing the number of declarations.) And a check that a RELAX NG grammar is not ambiguous (something like XSD's UPA) can be added on top as well (though perhaps not with Schematron with XPath 1, since it is not good for tracing arbitrary chains). IIRC there was a Japanese system that featured this.

As well as the distinction between what the two schema languages support directly (and how well), there is a useful distinction between what can be layered on top of them (and how well). Are these better done as a layer or monolithic? Are these better provided as a standard layer or left to the market to decide?
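
To make the first quibble concrete, here is a minimal sketch of the abstract-pattern check, written in Python over the RELAX NG XML syntax rather than as a Schematron schema. It is only an illustration under stated assumptions: the function names, the sample grammar and the pattern name `base.abstract` are all hypothetical, and "used directly" is taken to mean "referenced by a `ref` inside an `element` pattern without going through another `define`".

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"
ELEMENT = f"{{{RNG_NS}}}element"
REF = f"{{{RNG_NS}}}ref"


def direct_refs(element_decl):
    """Yield <ref> nodes inside one <element> pattern.

    Nested <element> patterns are skipped; each is checked on its own.
    """
    stack = list(element_decl)
    while stack:
        node = stack.pop()
        if node.tag == ELEMENT:
            continue
        if node.tag == REF:
            yield node
        stack.extend(node)


def abstract_violations(grammar_xml, abstract_names):
    """Return sorted (element name, pattern name) pairs where a pattern
    listed in abstract_names is referenced directly from an element."""
    root = ET.fromstring(grammar_xml)
    found = set()
    for decl in root.iter(ELEMENT):
        for ref in direct_refs(decl):
            if ref.get("name") in abstract_names:
                # "?" stands in when the element is named by a name class
                found.add((decl.get("name", "?"), ref.get("name")))
    return sorted(found)


# Hypothetical grammar: "base.abstract" is meant to be reached only
# through the derived pattern "para.type".
GRAMMAR = """
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start><ref name="doc"/></start>
  <define name="base.abstract"><attribute name="id"/></define>
  <define name="para.type"><ref name="base.abstract"/><text/></define>
  <define name="doc">
    <element name="doc">
      <oneOrMore>
        <choice>
          <element name="para"><ref name="para.type"/></element>
          <element name="bad"><ref name="base.abstract"/></element>
        </choice>
      </oneOrMore>
    </element>
  </define>
</grammar>
"""

print(abstract_violations(GRAMMAR, {"base.abstract"}))
# prints [('bad', 'base.abstract')]
```

The element `para` passes because it reaches `base.abstract` only through `para.type`; the element `bad` is reported because it references the abstract pattern itself. A real layer would also need to handle included and externally referenced grammars, which this sketch ignores.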
Secondly, the type derivation by single inheritance in XSD is not the only game in town, as far as modeling goes. What complex type inheritance does is say that X is also valid against base type Y and all the chain of base types (with the wrinkle that for derivation by extension by suffixation in XSD 1.0, you use (Y, *) as the base type). This kind of extra check can be done on the schema, as I mentioned. But it can also be done by validating the document separately with both X and Y.

And, indeed, this turns out to be a more powerful approach: for example, there is a RELAX NG schema for HTML with one schema for normal parent-child validation, and another schema that handles exclusion exceptions (e.g. that an html:a cannot have another html:a underneath it at any level). In order to do this constraint in XHTML 1.0 you would (unless there is some trick with <xsl:key> or something obscure) have to have duplicate content types, and because the global element declarations were in use you would have to use local declarations only. There are some modeling problems that are better dealt with as parallel problems, or with multiple inheritance (for which, in effect, the grammar needs to support ambiguity).

If your problem is coping with long-lived dialects, and mapping between concepts and dialects, DTDs, XML Schemas and RELAX NG don't provide any help. (Schematron's abstract patterns or XBRL or OWL are nearer, I think.)

I think XSD's marketing tends to give the false promise that it is big enough to scale to large issues, when it is big enough to get in the way. RELAX NG's marketing is much clearer (indeed, it is a premise of ISO DSDL): solve the small problems first and leave the complex modeling to other layers.

Finally, what is interesting to me is that for the last several years, over a variety of very large projects, I am seeing RELAX NG Compact being used by content analysts because of its terseness, then converted to XSD as needed.
This avoids some of the dangerous corners of XSD, but I think the main reasons are partly generational (people who grew up on DTDs) and partly practical (being able to write a content model on a single line, being able to see all the parts of a type on a single screen, not scrabbling around diagrams, not needing to change tools or views, not hanging around for the editing tool to [expletive deleted] in all the schema and load it).

So notional modeling power is one aspect, but convenience and usability (given a set of experiences and expectations) may trump power.

Cheers
Rick Jelliffe