[XML-DEV Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message] RE: A single, all-encompassing data validation language - good
On 2007-08-07 00:11:44 -0400 Michael Kay <mike@s...> wrote: >> I do not find it difficult to imagine a grammar that can specify >> these >> constraints; using the set-notation formalisms common in discussions >> of >> automata, it's relatively straightforward (handling the various >> gregorian >> exceptions gets increasingly difficult and verbose as one grows more >> and >> more precise, but it is not inherently beyond the scope of a grammar >> ... is >> it?). > > I've no idea what the theoretical limits of what you can do with a > grammar > are (I don't even know what the accepted definition of "grammar" is), > but I Err. Murata-san has a very nice paper that characterizes the sort of grammar embodied by several available grammars for XML, including W3C XML Schema (which is characterized as a "restrained-competition" grammar). Are you speaking of "grammar" used colloquially? Because--disclaimer, I am a humanities-trained sort just now investigating what amounts to Computer Science 101, via texts on formal languages, automata, and discrete mathematics--there does seem to be a fairly clear definition. A grammar of a particular type (that is, with particular rules for what it can define) defines a particular class of languages, and in turn is strongly associated with a class of automata. DTDs define a grammar for a regular language, and I believe that this was the design point of W3C XML Schema as well--a language that could be parsed by a deterministic finite automaton (WXS missed that, as a consequence of complex interactions, as Murata-san points out in the paper on taxonomy of schema languages). RNG has regular hedge grammar as its design point (and hedge automata). Context free grammars define a class of languages that includes regular languages (every regular grammar is a context free grammar), but with more power, and are associated with pushdown automata. It's very likely that most W3C XML Schema parsers are implemented in this fashion, rather than as deterministic (or equivalently, nondeterministic) finite automata. If that *is* the case, then it seems (to me) rather sensible to make use of the power of that style of automata more fully, rather than providing a sort of "escape" mechanism to break out of the automaton (which effectively puts you into turing machine territory, with an implementation problem of unknown complexity). > think I have a feel for the practical limits of when grammar ceases > to be > the most convenient way of stating the rules. I see people sometimes > doing > things with regular expressions that in my view are well beyond that > limit. That's a particular case of grammars. Regular language == regular expressions == regular grammar == deterministic and nondeterministic finite automata. Try and apply regular expressions to paren matching, and joy will not be your reward. You can't express "string over Sigma followed immediately by its reversal". But if W3C XML Schema has already passed the border from regular grammar into the land of context free grammar, why not investigate the limits of a CFG, before introducing yet another, unrelated technology? In fact, while I was teaching myself this stuff, I borrowed the brain of a mathematician friend of mine (hey, hardly been used, you kn-- owww!). She pointed out (and the various texts eventually explained, somewhat more verbosely) that you can gain quite a bit of power by equipping a finite state machine with a pushdown stack (that is, going from regular language to context free, from nfa/dfa to pushdown automata), so I naturally immediately suggested having *two* ... but was informed that such an automaton is functionally equivalent to a turing machine. Determination of its capabilities is *much* more complex, I gather. But these are ... surely they must be ... issues that the 1.1 W3C XML Schema working group has considered, are they not? With the example of RNG, and the formalisms underlying it, and the literature that increasingly surrounds the characterization of XML as a regular hedge language (consequently a language most comfortably defined by a regular hedge grammar, though not *necessarily* practically most effectively implemented in a hedge automaton), surely the original design point of W3C XML Schema 1.0, of a regular language which could be efficiently processed by a finite automaton (in fact, by a top-down deterministic finite automaton, which is the justification of UPA if I'm not mistaken) was *at least* discussed? The whole question of determinism (the unique particle attribution rule) *ought* to be on the WXS 1.1 WG's radar; it's a sore point with users of the technology (as is co-occurrence constraints, which is being addressed, albeit in a fashion I find rather worthy of remark). > Why stretch one technology to its limits when you've crossed into a > domain > where other technologies do it better? Uh ... like moving from topdown deterministic finite automata to pushdown automata? I'm sorry, but grammar defines language and is associated with a class of automata; WXS 1.0 had a very clear (if, in my opinion, mistaken) design goal in this area. WXS 1.1 has already introduced an exception to the UPA (one implicit in WXS 1.0, as I understand it, but the formal acknowledgement is an advance, or potentially is so), so ... either the NFA/DFA is equipped with a sort of "exception" machinery of indeterminate complexity, or a different class of automata is indicated as the design point. At the risk of repeating myself, I'm just learning stuff that they (supposedly?) teach CS undergraduates, so perhaps I've gone off the rails. If so, I'd be happy to hear in what fashion I've done so (and my employer would prolly be happy if I get unrealistic ideas out of my tiny little head). Amy! -- Amelia A. Lewis amyzing {at} talsever.com The flesh is strong. The spirit stronger. So shed your skin, baby. Let it through. Come on over. -- Amy Ray
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] |
PURCHASE STYLUS STUDIO ONLINE TODAY!Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced! Download The World's Best XML IDE!Accelerate XML development with our award-winning XML IDE - Download a free trial today! Subscribe in XML format
|