
RE: should all XML parsers reject non-deterministic content models?

  • From: Danny Ayers <danny@p...>
  • To: xml-dev@l...
  • Date: Mon, 15 Jan 2001 02:10:16 +0600

Hi,
I'm afraid I've lost the thread of these arguments - could someone please
give me an easy definition of what is meant by determinism (in the context of
content models), ambiguity (in the context of non-determinism) and how these
are being considered (in the context of conformance and compliance to
Recommendations). A comprehensible (normative) explanation would be
preferred. Are we talking about a processor which knows a closed set of
parameters or one with a default catch-all: "I have no idea what you are on
about"? Expansions of the acronyms DFA and NFA would also be appreciated.

Cheers,
Danny.

> -----Original Message-----
> From: Joe English [mailto:jenglish@f...]
> Sent: 14 January 2001 23:58
> To: xml-dev@l...
> Subject: Re: should all XML parsers reject non-deterministic content
> models?
>
>
>
> TAKAHASHI Hideo wrote:
>
> > I understand that the XML 1.0 spec prohibits non-deterministic (or,
> > ambiguous) content models (for compatibility, to be precise).
> > Are all xml 1.0 compliant xml processing software required to reject
> > DTDs with such content models?
>
> No: a processor can ignore the DTD entirely and still be compliant.
> And since the prohibition against non-deterministic content models
> appears in a non-normative appendix, I would presume that conforming
> DTD-aware processors are not required to detect this condition either.
> Even in full SGML, ambiguous content models are a "non-reportable
> markup error", i.e., parsers don't need to detect this condition.
>
> > Ambiguous content models don't cause any problems when you construct a
> > DFA via an NFA.  I have heard that there is a way to construct DFAs
> > directly from regexps without making an NFA, but that method can't
> > handle non-deterministic regular expressions.
>
> There are many, many other ways to validate documents against content
> models though.  Take a look at James Clark's TREX implementation,
> which has no problem with ambiguity, and also efficiently handles
> intersection, negation, and interleaving of content models
> (the first two of which are *very* expensive in a DFA-based
> approach).
>
>
> > If you choose that method
> > to construct your DFA, you will surely benefit from the rule in XML 1.0.
> > But if you choose not to, detecting non-deterministic content models
> > becomes an extra job.
>
> But note that detecting ambiguity in XML content models is considerably
> simpler than in SGML -- the really difficult part involves '&' groups
> which aren't present in XML.
>
>
> --Joe English
>
>   jenglish@f...
>
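
For readers following the terminology: a DFA is a deterministic finite automaton and an NFA is a nondeterministic finite automaton. In the XML 1.0 sense, a content model is deterministic when a parser never needs to look past the current element to decide which part of the model that element matches. The example in the XML 1.0 appendix is ((b, c) | (b, d)), which is non-deterministic because a leading b could belong to either branch, whereas the equivalent (b, (c | d)) is deterministic. What follows is only a rough sketch of one way to test this, using a Glushkov-style position construction. The tuple encoding of content models and the names analyse and is_deterministic are assumptions made for illustration; they do not come from the thread or from any particular parser.

```python
from itertools import count


def analyse(model):
    """Return (symbols, first, follow) for a content model.

    Hypothetical encoding: ("sym", name), ("seq", ...), ("alt", ...),
    ("star", x), ("plus", x), ("opt", x).
    """
    symbols = {}   # position -> element name
    follow = {}    # position -> positions that may come directly after it
    counter = count()

    def walk(node):
        """Return (nullable, first positions, last positions)."""
        kind = node[0]
        if kind == "sym":
            pos = next(counter)
            symbols[pos] = node[1]
            follow[pos] = set()
            return False, {pos}, {pos}
        if kind == "seq":
            nullable, first, last = True, set(), set()
            for child in node[1:]:
                c_null, c_first, c_last = walk(child)
                for p in last:
                    follow[p] |= c_first
                if nullable:            # every earlier child can be empty
                    first |= c_first
                last = c_last | (last if c_null else set())
                nullable = nullable and c_null
            return nullable, first, last
        if kind == "alt":
            nullable, first, last = False, set(), set()
            for child in node[1:]:
                c_null, c_first, c_last = walk(child)
                nullable = nullable or c_null
                first |= c_first
                last |= c_last
            return nullable, first, last
        if kind in ("star", "plus", "opt"):
            c_null, c_first, c_last = walk(node[1])
            if kind != "opt":           # repetition loops back to the start
                for p in c_last:
                    follow[p] |= c_first
            return (c_null or kind != "plus"), c_first, c_last
        raise ValueError(f"unknown node kind: {kind!r}")

    _, first, _ = walk(model)
    return symbols, first, follow


def is_deterministic(model):
    """True if no choice point offers two positions with the same name."""
    symbols, first, follow = analyse(model)
    for candidates in [first, *follow.values()]:
        names = [symbols[p] for p in candidates]
        if len(names) != len(set(names)):
            return False
    return True


b, c, d = ("sym", "b"), ("sym", "c"), ("sym", "d")
ambiguous = ("alt", ("seq", b, c), ("seq", b, d))   # ((b, c) | (b, d))
factored = ("seq", b, ("alt", c, d))                # (b, (c | d))

print(is_deterministic(ambiguous))  # False
print(is_deterministic(factored))   # True
```

The check only covers the sequence, choice, and occurrence operators found in XML DTD content models; SGML '&' groups, which the quoted message identifies as the hard case, are deliberately left out.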

