Re: Is Schematron (using XPath 2.0) functionally a superset of
On Sat, 2007-11-10 at 11:58 +0100, Michele Vivoda wrote:
> Hi,
>
> I use the schema also for its data-description
> features: to know which element/attrs are allowed,
> datatype etc, before validation,
> to _produce_ instances, that then I will validate.
> I think that it would be hard to
> get the same description from a list of xpaths,
> or at least I wouldn't know where to start.

Where to start? Well, forget grammars and learn XPaths!

More seriously, apart from the issue of expressive power (which Schematron with XSLT2 wins, AFAICS), I would never dispute that some things are easier to express in Schematron and some things are easier to express in XSD; indeed, some things are easier to express using XPaths and some things are easier to express using grammars. However, you set yourself a hard task if you are defending XSD because of its user-friendliness :-)

Why is it particularly harder to write a flat set of declarations

    <sch:pattern>
      <sch:rule context="AAA"> ... </sch:rule>
      <sch:rule context="AAA/BBB"> ... </sch:rule>
    </sch:pattern>

than a nested set?

    <xsd:element name="AAA">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="BBB"> ... </xsd:element>
          ...
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

Indeed, if you are processing using XSLT or DOM or most object models, you are probably using XPaths to locate the information items you are interested in. You are, quite probably, already doing XPath processing: why isn't validation merely double handling?

It is not impossible to imagine an implementation like Saxon doing dynamic typing of data items by using XPath matching, rather than implementing a finite automaton (especially because of that declaration rule I forgot last week.) I don't see why validating against a finite automaton and then using XPath matching is not really a kind of double handling: lazy type implication using XPaths would seem to be a respectable implementation technique that in some situations would give good performance.
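A minimal Python sketch of the lazy type implication idea above, under some loud assumptions: the flat rule table `RULES` and its type labels are hypothetical, and the matching uses only the limited XPath subset in the standard library's `xml.etree.ElementTree`, not a real Schematron or Saxon implementation. It only illustrates that the same path matching used to locate items can also attach a type to them:

```python
import xml.etree.ElementTree as ET

# Hypothetical flat rule table: (context path, type label).
# The contexts mirror the sch:rule contexts in the post; the labels are made up.
RULES = [
    ("AAA", "type-of-AAA"),
    ("AAA/BBB", "type-of-BBB-in-AAA"),
]


def annotate(xml_text):
    """Parse xml_text and return (root, {element: type label})."""
    root = ET.fromstring(xml_text)
    # ElementTree paths are relative to the element searched from, so wrap
    # the root in a synthetic parent to let a context match the root itself.
    holder = ET.Element("holder")
    holder.append(root)
    types = {}
    for context, label in RULES:
        for node in holder.findall(".//" + context):
            # As within a Schematron pattern, the first matching rule wins.
            types.setdefault(node, label)
    return root, types


if __name__ == "__main__":
    root, types = annotate("<AAA><BBB/><CCC><BBB/></CCC></AAA>")
    for node in root.iter():
        print(node.tag, types.get(node, "(no rule matched)"))
```

Elements that no context matches simply stay untyped, which is the "lazy" part: nothing like a content-model automaton is built up front.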
> Is it possible to go back from the schematron
> to the w3c schema ?

The area of information modeling, rather than just expression of bags of constraints, is of course quite important. (Whether it is so important that it should trump Plain Old Assertions is another matter.)

IMHO Schematron actually has a more powerful story than XSD in regard to modeling. Now Schematron does have a built-in set of component-equivalents, based on my view of the problems of DTDs (you might say that XSD embraced DTDs to the extent of wanting to explicate them, while I thought they were an inadequate basis.) Hence phases -> patterns -> rules -> assertions -> diagnostics.

However, the thing that I consider more powerful is the provision of abstract patterns (and abstract rules). These allow schema designers to invent and implement their own modeling system: their own abstractions, which they may check using XPaths or their own software, or just leave as formalized statements that have no computerized checking.

For example, using abstract rules, the following:

    <sch:rule abstract="true" id="t2" role="xsd-simpleType">
      <sch:extends rule="t1" />
      <sch:assert test=". &lt; 4" role="xsd-facet-maxExclusive">
        The value should be less than 4
      </sch:assert>
    </sch:rule>

has all the information needed to generate

    <xsd:simpleType name="t2">
      <xsd:restriction base="t1">
        <xsd:maxExclusive value="4" />
      </xsd:restriction>
    </xsd:simpleType>

and to round-trip. You say xsd:tomato and I say sch:tomato. As BR pointed out, it also has enough information that a type-annotating system could use this to annotate DOM items with type (and derivation) information.

For content models, abstract patterns come into play. You could have an abstract pattern usage such as

    <sch:pattern is-a="element-grammar">
      <sch:param name="element" value="AAA" />
      <sch:param name="grammar" value="' BBB, CCC+, DDD | EEE, FFF? '" />
    </sch:pattern>

which could then have a vacuous implementation*

    <sch:pattern abstract="true" id="element-grammar">
      <sch:rule context=" $element ">
        <sch:assert test="true()">The element <sch:name/> should follow
          the content model <sch:value-of select=" $grammar "/>
        </sch:assert>
      </sch:rule>
    </sch:pattern>

So it is not true to say that you cannot represent in a Schematron schema all the information that would be needed to round-trip to an XSD schema: it is just a matter of conventions and support software. Schematron is one step ahead of XSD in that while XSD provides a container (appinfo) in which a similar mechanism could be constructed, Schematron provides a way to give such user-defined constructs a name, parameterize the declarations, give the parameters names, and pass through user-friendly text.

Interestingly, it is entirely possible to use Schematron abstract patterns to make up your own schema language, which you may then implement yourself without using the Schematron XPath mechanism at all: you just use Schematron as a vocabulary for driving your own modeling application. Schematron has this basic extensibility that XSD (and other schema languages that don't provide an explicit mechanism) do not.

Can we have our cake and eat it? Can we express the grammar in a way suitable for round-tripping and also express the constraints in a way that Schematron can use to validate? Here is an example of this:

    <sch:pattern is-a="element-grammar-1">
      <sch:param name="element" value="AAA" />
      <sch:param name="grammar" value="' BBB, CCC+, DDD | EEE, FFF? '" />
      <sch:param name="children" value=" BBB | CCC | DDD | EEE | FFF " />
      <sch:param name="children-text" value="' BBB or CCC or DDD or EEE or FFF '" />
    </sch:pattern>

and then implement some tests

    <sch:pattern abstract="true" id="element-grammar-1">
      <sch:rule context=" $element ">
        <!-- untested assertion -->
        <sch:assert test="true()">The element <sch:name/> should follow
          the content model <sch:value-of select=" $grammar "/>
        </sch:assert>
        <!-- tested assertion -->
        <sch:assert test=" count( $children ) = count(*)">
          The element <sch:name/> should only have children from the
          following list: <sch:value-of select=" $children-text "/>
        </sch:assert>
      </sch:rule>
    </sch:pattern>

Of course, we could have software that automagically generated the extra parameters from the first form of abstract pattern: the information is all there.

Anyway, the bottom line is that Schematron allows both the abstract specification of all sorts of constraints (with a mechanism of named parameters, which is more than XSD provides) and the specific testing of many kinds of constraints (using XPaths, which are more powerful than what XSD provides.) So the gap is in the middle, in developing conventions and software to implement the home-made schema-models that users may develop.

XSD uses grammars because XML DTDs used grammars. XML DTDs used grammars because SGML DTDs used grammars. SGML DTDs used grammars because the delimiter map could change while parsing an element: short references in particular. So for SGML, grammars were both a necessity and wonderful low-hanging fruit. But for XML, we have no technical requirement that forces us to use grammars to model: we can explore and figure out the best mechanism for our tasks. Grammars are familiar (though not to DB people); they are terse (though not in the XSD syntax); they are powerful (though not in the XSD model); they are easy to implement (but not with all the XSD extras); but schema != grammar.
Any grammar is a round hole, but sometimes you have a square peg. And XSD is sometimes a very tight hole indeed.

Cheers
Rick Jelliffe

* http://www.oreillynet.com/xml/blog/2007/03/expressing_untested_and_untest.html
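As a postscript to the element-grammar-1 example above, here is a minimal Python sketch of its tested assertion, `count( $children ) = count(*)`. The parameter names come from the post, but the `PARAMS` dictionary layout and the `check_children` helper are assumptions made for illustration, and this is plain `xml.etree.ElementTree` code rather than a Schematron implementation:

```python
import xml.etree.ElementTree as ET

# Parameters as they might be passed to the abstract pattern.
# The names follow the post; the dict layout is an assumption.
PARAMS = {
    "element": "AAA",
    "grammar": "BBB, CCC+, DDD | EEE, FFF?",  # carried along, never tested
    "children": ["BBB", "CCC", "DDD", "EEE", "FFF"],
}


def check_children(xml_text, params):
    """Return one message per context element that fails the tested assertion."""
    root = ET.fromstring(xml_text)
    allowed = set(params["children"])
    failures = []
    # Every element named by the "element" parameter is a rule context.
    for node in root.iter(params["element"]):
        kids = list(node)
        # Equivalent of count($children) = count(*): every child element
        # must be one of the allowed names.
        if sum(1 for kid in kids if kid.tag in allowed) != len(kids):
            failures.append(
                "The element %s should only have children from the "
                "following list: %s" % (node.tag, " or ".join(params["children"]))
            )
    return failures


if __name__ == "__main__":
    print(check_children("<AAA><BBB/><CCC/><CCC/></AAA>", PARAMS))
    print(check_children("<AAA><BBB/><ZZZ/></AAA>", PARAMS))
```

As in the post, the grammar string itself stays an untested statement: this check only looks at which children occur, not at their order or repetition.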