(Correction) Re: Are we losing out because of grammars?
(oops correction and resend. apologies)

From: James Clark <jjc@j...>

>Whilst I think the approach used by Schematron is a valuable complement
>to grammar based schemas (obviously I'm personally delighted to see
>XPath getting used for validation), I really find it very hard to take
>seriously the idea that the time has come to completely discard grammars
>in favour of path-based rule systems.

XPath and XSLT are great.

>Let's take a really simple example:
>
><!ELEMENT a (b?, c)>
><!ELEMENT b (#PCDATA)>
><!ELEMENT c (#PCDATA)>

If efficiency and terseness are the criteria, what about:

  <pattern>
    <rule context="a">
      <assert test=
        "*[1][self::b][following-sibling::c[position()=last()]]
         or *[1][self::c][position()=last()]" />
    </rule>
    <rule context="b[* or @*] | c[* or @*]">
      <report test="1=1">Should be empty.</report>
    </rule>
  </pattern>

This has 5 functioning elements compared to TREX's 6. It only requires
looking at the first child. (This is an example of elaborating each path,
which is nasty for larger rules.) But it is not particularly the way I'd
envision people will use Schematron.

This can be pretty printed to give a very direct list of rules about the
schema. I note again that the comprehensibility of a Schematron schema
comes not from its paths (though often these are simple) but because there
is a pretty direct path for making everything explicit in simple natural
language statements. If one element can follow another, we can explain
"why".

But let's try a different example, quid pro quo. This is a real one, coming
from discussion on how to mark up news stories. The client gave the
following requirement:

  "Every news story must have elements to mark up who, what, where, when
  and how. There must be one and only one of each in every story. They
  can appear anywhere."

This requirement is very easy to express in words.
It is also trivially easy to express in Schematron:

  <pattern>
    <rule context="/">
      <assert test="count(//news:who)=1 and count(//news:what)=1
                    and count(//news:where)=1 and count(//news:when)=1
                    and count(//news:how)=1"
      >Every news story must have elements to mark up who, what, where,
      when and how. There must be one and only one in every story. They
      can appear anywhere.</assert>
    </rule>
  </pattern>

One can take these constraints and add them to any Schematron schema
without change and it will work. (If the other schema is closed, then that
is an internal inconsistency, which is a different matter. However,
Schematron schemas are open by default.)

Let's say we add this to a Schematron schema for full DocBook. The addition
in Schematron is just a single rule, and it will fit in with all the
constraints already in place. It seems to me that this would cause a
grammar-based schema language to explode, if it could cope at all: XML
Schemas could not cope (if the schema had used <or> groups, and we have to
assume that there are already "<any>" wildcards in place so these new
elements are allowed anywhere.)

>It seems self-evident to me that both grammars and path-based rule
>systems have their place. Some problems can be solved most conveniently
>with just grammars, some most conveniently with just path-based rule
>systems and some most conveniently with a combination. Can't we just
>leave it at that? What is the point of this crusade against grammars?

Oh, it is no crusade against grammars. My point all along has been that
there has never been any discussion which establishes grammars are best or
don't cause more problems (for people implementing ad hoc editing systems
for example) than they solve. "A life unexamined is not worth living."

James, Murata-san and I (through my work with XML Schemas) are merrily
foisting grammar-based systems on the world; people may perhaps expect that
there is some particular reason, perhaps so obvious it never has needed to
be stated, why grammars should be regarded by technocrats (even those
outside Asia) as the best approach to take in preference to others. I don't
mean James or Murata-san need to justify why they take an approach, though
I think the XML Schema WG perhaps should. I am suggesting that the real
reason for this global phenomenon is closer to "err, we haven't tried or
considered anything else."

There are now three credible second-generation grammar-based XML schema
languages. If we include the first generation (SOX 1 & 2, DDML, XML Data &
XDR, DSD, etc) that gives us perhaps 10 grammar-based schema languages. If
the use of the grammar:rules paradigm is over 10:2 (I include XLinkIt with
Schematron) and implementation is 50:1 then people may draw the conclusion
that most experts and implementors think that grammars are better than
rules for schema languages, rather than that implementors are trying to
find the most general and powerful replacement for DTDs within the grammar
framework *without* considering the underlying issue of what is best.

Cheers
Rick Jelliffe
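
For anyone who wants to try the news-story rule without a Schematron processor to hand, here is a rough Python sketch of the same check. It is only an illustration of the constraint, not how Schematron itself is implemented. The namespace URI, the names `check_story`, `GOOD` and `BAD`, and the sample story are all assumptions made up for the example; the message never says what the `news:` prefix is bound to.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI: the original message gives none for "news:".
NEWS_NS = "http://example.org/news"
REQUIRED = ("who", "what", "where", "when", "how")


def check_story(xml_text):
    """Return a list of complaints; an empty list means the rule holds."""
    root = ET.fromstring(xml_text)
    problems = []
    for name in REQUIRED:
        # root.iter() walks the whole tree (root included), which plays
        # the role of count(//news:NAME) in the Schematron assertion.
        found = sum(1 for _ in root.iter(f"{{{NEWS_NS}}}{name}"))
        if found != 1:
            problems.append(f"expected exactly one news:{name}, found {found}")
    return problems


# A made-up story: the five elements sit at arbitrary places in the tree.
GOOD = f"""<story xmlns:news="{NEWS_NS}">
  <para><news:who>The council</news:who> voted <news:how>by show of hands</news:how>
  to <news:what>close the bridge</news:what></para>
  <para><news:when>Tuesday</news:when> in <news:where>Sydney</news:where>.</para>
</story>"""

# Same story with the news:when markup stripped out.
BAD = GOOD.replace("<news:when>Tuesday</news:when>", "Tuesday")

print(check_story(GOOD))
print(check_story(BAD))
```

Note that the check knows nothing about `story` or `para`. Like the Schematron rule, it can be layered over whatever other vocabulary the document uses, as long as that vocabulary leaves room for the five elements.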