Are we losing out because of grammars? (Re: Schema ambiguitydetection a
From: K.Kawaguchi <k-kawa@b...>

>While XML Schema gains more and more weight, we have two relatively
>light-weight schema language. One is RELAX, and the other is TREX.
>And I think many people is curious about the difference of these two
>languages.
..
>I'd appreciate any comments from you about this.

Without disrespect to either RELAX or TREX (and certainly not to the people involved, who have my highest respect as innovators and activists), my serious comment is "why bother?"

Of course, I don't mean that this is not academically interesting or commercially relevant, nor that it is not vital for getting efficient implementations, nor that it is not important that we have strong alternative schema languages for rich diversity and cross-pollination.

What I mean is this: ambiguity is an artifact of the grammar paradigm, not of schemas per se. In this, it is the same as studies of how to get the union of schemas. In other words, RELAX may be lightweight for implementing, but not lightweight for reasoning (since grammar-based schema languages are impoverished in the direct relationships they can express).

What if, even after figuring out how to handle ambiguity and unions in grammars, we are still left with a paradigm that is not expressive enough for implementing the human-oriented, concept-modeling/data-modeling systems that some people think are important?

I believe proponents of the various schema paradigms (grammars and rule-based systems) need to justify that their paradigm is useful. When we only had grammars, it was a moot point. Now that we have Schematron and other rule-based systems, I think we can be bold enough to start to critique grammars-as-schemas. (Of course, this is a two-way street.)

Let's contrast the grammar approach with a rule/path-based approach (as in Schematron). In Schematron we don't have any ambiguity problem, because we don't have alternate paths for any location expression in the same way. The union issue is a bit more tricky.
Naively, one merely sticks the patterns from both schemas into one schema by cut and paste, and they are evaluated as separate patterns. There is no need to merge the patterns (just like TREX avoids the problem, I think, by allowing parallel content models).

An inferencing system could be made which checked whether each assertion totally or partially conflicted with another, and which assertions could be joined. Some notion of which constraints could be relaxed would be important too. If schema 1 says count(*) > 5 and schema 2 says count(*) > 6, what is the result of "merging"? In that case there is a universe of possible merges, with some being more interesting than others.

So my comment is this: doesn't the presence of these tricky ambiguity issues mean that actually understanding RELAX (and presumably certain other schema languages) requires a computer scientist, not a data modeler? Might we get to higher-level schema languages faster by completely ditching the grammar paradigm and treating schemas as systems of logical assertions which can be queried?

XML does not have SGML's short refs and delimiter maps, so why does it now need grammar-based content-modeling?

Cheers
Rick Jelliffe
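
The count(*) question above can be made concrete with a small sketch. It is not Schematron and not part of any schema language; the MinCount class and the two merge functions are hypothetical names used only to illustrate two of the many possible merges, assuming each assertion is a simple lower bound on the number of child elements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MinCount:
    """Hypothetical assertion: the number of child elements exceeds `bound`."""
    bound: int

    def holds(self, child_count: int) -> bool:
        return child_count > self.bound


def merge_all(a: MinCount, b: MinCount) -> MinCount:
    """One possible merge: a document must satisfy both schemas (conjunction)."""
    return MinCount(max(a.bound, b.bound))


def merge_any(a: MinCount, b: MinCount) -> MinCount:
    """Another possible merge: a document may satisfy either schema (disjunction)."""
    return MinCount(min(a.bound, b.bound))


schema1 = MinCount(5)  # count(*) > 5
schema2 = MinCount(6)  # count(*) > 6

strict = merge_all(schema1, schema2)   # equivalent to count(*) > 6
relaxed = merge_any(schema1, schema2)  # equivalent to count(*) > 5
```

An element with exactly six children passes the relaxed merge and fails the strict one, which is the kind of choice an inferencing system would have to make or expose to the schema author.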