Re: Are we losing out because of grammars?
From: Joe English <jenglish@f...>

>Rick Jelliffe wrote:
>
>> I believe proponents of various schema paradigms (grammars and
>> rule-based systems) need to justify that their paradigm is useful....

>I think the utility of grammar-based schemas has been well established.

As far as I know, the decision to use grammars in XML Schemas was made completely uncritically. And how could it be otherwise if there has been no discussion?

So the utility of grammar-based schemas is as well-established as the utility of the horse-drawn stump-jump plough: yes, we can do excellent things with them that we would not do without them, but if we are not Amish, why use a horse when there are shining tractors with air-conditioned cabs and diverting CDs of Dwight Yoakam yodelling and the Georgia Peach singing?

If we know XML documents need to be graphs, why are we working as if they are trees? Why do we have schema languages that enforce the treeness of the syntax rather than provide the layer to free us from it?

>One of the biggest benefits of grammar-based schemata is that
>they are generative: there is an enumeration procedure as well
>as a decision procedure.

A rule-based schema can also be generative, using type inferencing. The simpler the rules language, the simpler the inferencing required. For example, given

    <rule context="book"><assert test="chapter"/></rule>

an inferencing interface can say, when we are in a book, "do you want to add a chapter?" So guiding or performing generation is not a unique quality of grammars. The difference is that a grammar pretty much forces one to work in sibling order, while a rule-based system is more free.

If a test is too tricky or too general to infer a specific action, then the plain text of the assertion becomes more important to guide the generator--this is not a weakness but a strength. (And Schematron 1.5 allows two extra hints on assertions: a role attribute and a subject path, which can be used to key specific handlers for assertions.
And the diagnostics for each assertion allow you to get the value of expressions, and so have dynamic error messages. Plus the phase mechanism will let you turn off sets of patterns: so there could be a set of patterns (a phase) for when one is creating a skeletal document (so you only get generative hints about these), and then in a different phase you get interested in something else--current schema languages do not allow management of a division of labour or tasking for data entry.)

>But there *is* an ambiguity issue in Schematron -- the requirement
>that a node can match the context of at most one <rule> in
>a given <pattern>. As you mentioned, this doesn't cause
>a problem when trying to merge two schemas (just include all
>the <pattern>s from both schemas), but the question of whether
>all the <rules> within each of the original <pattern>s match
>disjoint node-sets remains.

I don't see that, really. If you want to merge two rule sets (unless they happen to be the same) you may get a minor explosion, but there is no ambiguity. I think it is just straightforward set operations, and you always know which one will fire because of lexical priority.

So, to merge any two patterns, for each rule in pattern A take each rule in pattern B and create a new rule by anding the context attributes from the current A and B and combining their contents (and, at the end, also have all the A rules and all the B rules uncombined to handle the fallthrough case). Maintain the order; prune impossible cases if desired (they are harmless, so pruning is just for performance reasons); refactor rule contexts to eliminate common subexpressions as desired.

(Schematron provides a syntactic sugar facility called abstract rules, which will keep the line count under control: instead of combining the assert statements, make an abstract rule with the contents of each source rule and then add references to the abstract rule inside the generated rules.)
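
The pattern-merging procedure described above can be sketched roughly as follows. This is only an illustration in Python, not Schematron tooling: the `Rule` class, `and_contexts`, and `merge_patterns` are hypothetical names, contexts and assertion tests are treated as opaque strings, and the `(a) AND (b)` notation is a placeholder for "matches both contexts" rather than valid pattern syntax.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Rule:
    """Hypothetical stand-in for a Schematron <rule>."""
    context: str              # context expression, kept as an opaque string
    asserts: Tuple[str, ...]  # assertion tests, also opaque strings


def and_contexts(a: str, b: str) -> str:
    # Placeholder notation only: a real tool would have to rewrite this
    # into an expression that its pattern language actually accepts.
    return f"({a}) AND ({b})"


def merge_patterns(pattern_a: List[Rule], pattern_b: List[Rule]) -> List[Rule]:
    """Merge two patterns, assuming the first matching rule fires."""
    merged: List[Rule] = []
    # Cross-combine: one new rule per (A rule, B rule) pair, in source order.
    for ra in pattern_a:
        for rb in pattern_b:
            merged.append(
                Rule(and_contexts(ra.context, rb.context),
                     ra.asserts + rb.asserts)
            )
    # Fallthrough: nodes that match a rule in only one of the patterns.
    merged.extend(pattern_a)
    merged.extend(pattern_b)
    return merged


if __name__ == "__main__":
    # Made-up example rules, for illustration only.
    a = [Rule("book", ("chapter",)), Rule("chapter", ("title",))]
    b = [Rule("book", ("author",))]
    for rule in merge_patterns(a, b):
        print(rule.context, list(rule.asserts))
```

In this example the combined rule for `chapter` and `book` can never match a node; it is one of the harmless impossible cases that could be pruned. Because the combined rules come first and keep their source order, the rule that fires for any node is still determined by lexical priority.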
>> XML does not have SGML's short refs and delimiter maps, so why does it
>> now need grammar-based content-modeling?
>
>XML doesn't need *any* kind of schema language

But we have them, people want/need them, and the underlying paradigm has never been explicitly justified compared to alternatives.

Cheers
Rick Jelliffe