RE: should all XML parsers reject non-deterministic content models?
Hi,

I'm afraid I've lost the thread of these arguments. Could someone please
give me an easy definition of what is meant by determinism (in the context
of content models), ambiguity (in the context of non-determinism), and how
these are being considered (in the context of conformance and compliance to
Recommendations)? A comprehensible (normative) explanation would be
preferred.

Are we talking about a processor which knows a closed set of parameters, or
one with a default catch-all: "I have no idea what you are on about?"

Expansions of the acronyms DFA and NFA would also be appreciated.

Cheers,
Danny.

> -----Original Message-----
> From: Joe English [mailto:jenglish@f...]
> Sent: 14 January 2001 23:58
> To: xml-dev@l...
> Subject: Re: should all XML parsers reject non-deterministic content
> models?
>
> TAKAHASHI Hideo wrote:
>
> > I understand that the XML 1.0 spec prohibits non-deterministic (or,
> > ambiguous) content models (for compatibility, to be precise).
> > Are all xml 1.0 compliant xml processing software required to reject
> > DTDs with such content models?
>
> No: a processor can ignore the DTD entirely and still be compliant.
> And since the prohibition against non-deterministic content models
> appears in a non-normative appendix, I would presume that conforming
> DTD-aware processors are not required to detect this condition either.
> Even in full SGML, ambiguous content models are a "non-reportable
> markup error", i.e., parsers don't need to detect this condition.
>
> > Ambiguous content models don't cause any problems when you construct a
> > DFA via an NFA. I have heard that there is a way to construct DFAs
> > directly from regexps without making an NFA, but that method can't
> > handle non-deterministic regular expressions.
>
> There are many, many other ways to validate documents against content
> models though. Take a look at James Clark's TREX implementation,
> which has no problem with ambiguity, and also efficiently handles
> intersection, negation, and interleaving of content models
> (the first two of which are *very* expensive in a DFA-based
> approach).
>
> > If you choose that method
> > to construct your DFA, you will surely benefit from the rule in XML 1.0.
> > But if you choose not, detecting non-deterministic content models
> > becomes an extra job.
>
> But note that detecting ambiguity in XML content models is considerably
> simpler than in SGML -- the really difficult part involves '&' groups
> which aren't present in XML.
>
> --Joe English
>
> jenglish@f...
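
To make the quoted point about detecting non-deterministic content models concrete, here is a minimal sketch in Python. It is not taken from any parser mentioned in this thread. It assumes a content model is written as nested tuples (a hypothetical representation chosen for this example) and uses the standard position-based test: number every occurrence of an element name, compute which occurrences can come first and which can follow each other, and report a name as ambiguous when two different occurrences of it compete at the same point. The function name `ambiguous_names` and the node kinds `sym`, `seq`, `alt`, `star`, `plus`, `opt` are illustrative only. XML has no '&' groups, so none are modelled.

```python
from itertools import count


def ambiguous_names(model):
    """Return the sorted element names that make `model` non-deterministic.

    `model` is a nested tuple (hypothetical encoding for this sketch):
      ("sym", name), ("seq", a, b), ("alt", a, b),
      ("star", a), ("plus", a), ("opt", a)
    An empty result means the model is deterministic.
    """
    symbol_of = {}  # position -> element name
    follow = {}     # position -> positions that may come directly after it
    ids = count()

    def walk(node):
        """Return (nullable, first positions, last positions) for node."""
        kind = node[0]
        if kind == "sym":
            p = next(ids)
            symbol_of[p] = node[1]
            follow[p] = set()
            return False, {p}, {p}
        if kind in ("seq", "alt"):
            n1, f1, l1 = walk(node[1])
            n2, f2, l2 = walk(node[2])
            if kind == "alt":
                return n1 or n2, f1 | f2, l1 | l2
            # Sequence: whatever ends the left part may be followed by
            # whatever starts the right part.
            for p in l1:
                follow[p] |= f2
            first = (f1 | f2) if n1 else f1
            last = (l1 | l2) if n2 else l2
            return n1 and n2, first, last
        if kind in ("star", "plus", "opt"):
            n, f, l = walk(node[1])
            if kind != "opt":
                # Repetition: the end of the body may loop back to its start.
                for p in l:
                    follow[p] |= f
            return (n or kind != "plus"), f, l
        raise ValueError(f"unknown node kind: {kind!r}")

    def clashes(positions):
        """Names that occur at more than one position in `positions`."""
        seen, dup = set(), set()
        for p in positions:
            name = symbol_of[p]
            (dup if name in seen else seen).add(name)
        return dup

    _, first, _ = walk(model)
    bad = clashes(first)
    for successors in follow.values():
        bad |= clashes(successors)
    return sorted(bad)


b, c, d = ("sym", "b"), ("sym", "c"), ("sym", "d")
print(ambiguous_names(("alt", ("seq", b, c), ("seq", b, d))))  # ['b']
print(ambiguous_names(("seq", b, ("alt", c, d))))              # []
```

The two calls at the end use the example usually given for this rule: ((b, c) | (b, d)) is flagged because a parser seeing `b` cannot tell which branch it is in without looking ahead, while the equivalent (b, (c | d)) passes.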