Re: should all XML parsers reject non-deterministic content models?
On Sun, Jan 14, 2001 at 04:42:55PM +0900, TAKAHASHI Hideo(BSD-13G) wrote:
> Hello.
>
> I understand that the XML 1.0 spec prohibits non-deterministic (or,
> ambiguous) content models (for compatibility, to be precise).
Note also that this is stated in a non-normative appendix.
> Is all XML 1.0 compliant XML processing software required to reject
> DTDs with such content models?
Since it is stated only non-normatively, I don't think this is the
case in theory.
In practice this can be a problem. I recently ran into a DTD developed
at the IETF which was clearly non-deterministic. This also means that
the rule introduces new classes of XML parsers among the validating ones:
- those which detect and report non-deterministic content models
- those which validate (correctly or not) using non-deterministic
content models
> Ambiguous content models don't cause any problems when you construct a
> DFA via an NFA. I have heard that there is a way to construct DFAs
> directly from regexps without making an NFA, but that method can't
> handle non-deterministic regular expressions. If you choose that method
> to construct your DFA, you will surely benefit from the rule in XML 1.0.
> But if you choose not to, detecting non-deterministic content models
> becomes an extra job.
I tried to read the Brüggemann-Klein thesis listed in the references and
found it a bit frightening, though very informative. The beginning
of Part I, on Document Grammar, for example makes clear that the SGML
view of unambiguity of a content model is really one-token-lookahead
determinism.
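To make the one-token-lookahead idea concrete, here is a rough sketch of a
determinism check. It is only an illustration under my own assumptions, not
code from the thesis or from any parser: the helper names (sym, seq, alt,
star, opt, analyse, conflicts, is_deterministic) are made up for this
example. It numbers each element name occurrence in the content model,
computes which occurrences can come first and which can follow each
occurrence, and reports a conflict whenever two different occurrences of
the same name compete at the same point. The two models tested are the
ones used as examples in the appendix of the XML 1.0 spec.

```python
from itertools import count

# Hypothetical constructors for a content model tree.
def sym(name): return ("sym", name)
def seq(*parts): return ("seq",) + parts
def alt(*parts): return ("alt",) + parts
def star(x): return ("star", x)
def opt(x): return ("opt", x)

def analyse(node, labels, follow, ids):
    """Return (nullable, first, last) for node; fill labels and follow."""
    kind = node[0]
    if kind == "sym":
        p = next(ids)              # each occurrence gets its own position
        labels[p] = node[1]
        follow[p] = set()
        return False, {p}, {p}
    if kind in ("star", "opt"):
        _, first, last = analyse(node[1], labels, follow, ids)
        if kind == "star":         # a repetition can loop back to its start
            for p in last:
                follow[p] |= first
        return True, first, last
    parts = [analyse(child, labels, follow, ids) for child in node[1:]]
    if kind == "alt":
        return (any(n for n, _, _ in parts),
                set().union(*(f for _, f, _ in parts)),
                set().union(*(l for _, _, l in parts)))
    # Sequence: link the end of the prefix to the start of each next part.
    nullable, first, last = True, set(), set()
    for n, f, l in parts:
        for p in last:
            follow[p] |= f
        if nullable:
            first |= f
        last = (last | l) if n else set(l)
        nullable = nullable and n
    return nullable, first, last

def conflicts(model):
    """List (origin, name, pos1, pos2) where one name has two candidates.

    Origin 0 stands for the start of the content model.
    """
    labels, follow = {}, {}
    _, first, _ = analyse(model, labels, follow, count(1))
    found = []
    for origin, targets in [(0, first)] + sorted(follow.items()):
        seen = {}
        for p in sorted(targets):
            name = labels[p]
            if name in seen:
                found.append((origin, name, seen[name], p))
            else:
                seen[name] = p
    return found

def is_deterministic(model):
    return not conflicts(model)

# ((b, c) | (b, d)): after seeing "b" the parser cannot tell which branch.
ambiguous = alt(seq(sym("b"), sym("c")), seq(sym("b"), sym("d")))
# (b, (c | d)): the same language, rewritten deterministically.
rewritten = seq(sym("b"), alt(sym("c"), sym("d")))
print(is_deterministic(ambiguous), is_deterministic(rewritten))
```

A parser in the first class above would run a check of this kind when it
loads the DTD and report the conflict, rather than finding out (or not)
while validating a document.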
In practice this is a very good rule because it simplifies the
validation of a content model a lot. The problem is that grammars
need to be rewritten to conform to it (the thesis proves it's always
possible, at least).
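As a small illustration of why validation gets simpler, compare the two
sketches below. Everything here is an assumption made for the example: the
transition tables are written by hand for the content models (b, (c | d))
and ((b, c) | (b, d)), and the function names are hypothetical, not taken
from any real parser. With the deterministic model, each child element name
picks at most one next state, so the validator tracks a single state. With
the model as originally written, it has to carry a set of candidate states.

```python
# Hand-built table for (b, (c | d)): one next state per (state, name).
DETERMINISTIC = {
    (0, "b"): 1,
    (1, "c"): 2,
    (1, "d"): 2,
}
DETERMINISTIC_ACCEPTING = {2}

def validate(children, table=DETERMINISTIC, accepting=DETERMINISTIC_ACCEPTING):
    """Validate a list of child element names with a single current state."""
    state = 0
    for name in children:
        key = (state, name)
        if key not in table:
            return False           # no transition: the child is not allowed
        state = table[key]
    return state in accepting

# Hand-built table for ((b, c) | (b, d)): "b" leads to two candidates.
AMBIGUOUS = {
    (0, "b"): {1, 3},
    (1, "c"): {2},
    (3, "d"): {4},
}
AMBIGUOUS_ACCEPTING = {2, 4}

def validate_tracking_sets(children, table=AMBIGUOUS,
                           accepting=AMBIGUOUS_ACCEPTING):
    """Validate by following every candidate state at once."""
    states = {0}
    for name in children:
        states = set().union(*(table.get((s, name), set()) for s in states))
        if not states:
            return False
    return bool(states & accepting)

for doc in (["b", "c"], ["b", "d"], ["b"], ["c"], ["b", "c", "d"]):
    print(doc, validate(doc), validate_tracking_sets(doc))
```

Both functions accept the same documents here; the difference is only in
how much bookkeeping the validator needs at each step.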
> I can see that parsers that allow non-deterministic content models may
> be harmful to the user. The user won't notice that his DTD may be
> rejected by other parsers.
>
> So there seems to be good reason for the XML 1.0 spec to prohibit
> parsers that accept non-deterministic content models. In that case the
> spec not only gives chance for a particular DFA constructing algorithm
> to be used, but effectively recommends the usage of the algorithm.
As usual, such suggestions should also be provided to the spec comment
list so I'm forwarding it to xml-editor@w...,
Daniel
--
Daniel Veillard | Red Hat Network http://redhat.com/products/network/
daniel@v... | libxml Gnome XML toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/