
Re: Deterministic Content Models ?

  • From: Richard Goerwitz <richard@g...>
  • To: xml-dev@i...
  • Date: Sun, 13 Sep 1998 16:15:09 -0400

Philippe Le Hégaret wrote:

> > Is (paragraph*)* a deterministic content model ?
> > If yes, so I think (a+ | b)* is a deterministic content model too.
> > >
> > >   it is an error if an element in the document can match more
> > >   than one occurrence of an element type in the content model.
>
>   I'm not totally agree with you, because if you write the
> sequence like this:
>
>     (a, a*)*
>
> is it still deterministic ? For me no, because there are
> two states in this content model. (a+)* is the same case and
> (a+ | b)* too.

Looks like everybody is more or less correct.

The whole point of flagging nondeterministic content models (which
is what SGML did, and XML may optionally do) is that nondeterministic
content models often indicate logic errors by the writer.

Put somewhat differently, if a DTD writer composes a content model
that allows a given sequence of elements to be processed in more
than one way, this often indicates an error.

So, for example, with (a, a*)*, it's hard to imagine what is
intended, because a single <a/><a/> could match two instances of
(a, a*), or one instance of (a, a*), depending on how you go
through the automaton.  Processors may, incidentally, flag (a+)*
as "ambiguous", since a+ is usually implemented as (a, a*).
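
To make the two readings of <a/><a/> concrete, here is a minimal
Python sketch (not from any validator; the function name parses is
made up for illustration).  It enumerates every way a run of n <a/>
elements can be divided into iterations of the group (a, a*), where
each iteration must consume at least one element.

```python
def parses(n):
    """Yield each way to split n consecutive <a/> elements into
    iterations of (a, a*), as a tuple of per-iteration sizes."""
    if n == 0:
        yield ()          # nothing left to consume: stop iterating
        return
    for first in range(1, n + 1):       # size of the next (a, a*) iteration
        for rest in parses(n - first):  # all ways to consume the remainder
            yield (first,) + rest


if __name__ == "__main__":
    for split in parses(2):
        print(split)
```

For n = 2 this yields (1, 1) and (2,): two iterations of one element
each, or a single iteration that takes both.  Any input with more
than one reading like this is what the determinism rule is meant to
rule out.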

Such ambiguities create unintended differences in how the same
input might be processed by different software.  Or they simply
lead to the input being processed in a way that surprises the user
(or worse yet, the programmer).

That's why I think it's a good idea for validators, in particular,
to flag "ambiguous" content models aggressively.

To test these sorts of things is easy enough.  Just make up a toy
DTD and run it through a good validator.  Take, for example, the
following (where elements x, y, and z should get flagged as
"ambiguous"):

<!DOCTYPE test [
  <!ELEMENT test ANY>
  <!ELEMENT a EMPTY>
  <!ELEMENT b EMPTY>
  <!ELEMENT w (a*)*>
  <!ELEMENT x (a+ | b)*>
  <!ELEMENT y (a, a*)*>
  <!ELEMENT z (a+, b?, a+)>
]>

<test></test>

Yes, as always, you can try this out with the validator at:

  http://www.stg.brown.edu/service/xmlvalid/
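
For anyone who wants to poke at this without a validator, below is
a rough Python sketch of one common way to test for determinism:
give every element occurrence in the model its own position, compute
the usual first/follow sets, and report a conflict whenever two
different positions with the same element name compete at the same
point.  This is only an illustration, not how any particular
validator works.  All the names (sym, seq, alt, star, opt, plus,
analyse, is_deterministic) are invented here, and expanding a+ to
(a, a*) is an assumption taken from the remark above about how a+
is usually implemented.

```python
from itertools import count

# Tiny content-model AST, built with helper functions (hypothetical names).
def sym(name): return ("sym", name)
def seq(*items): return ("seq",) + items
def alt(*items): return ("alt",) + items
def star(e): return ("star", e)
def opt(e): return ("opt", e)
def plus(e): return seq(e, star(e))  # assumption: a+ is expanded to (a, a*)

def analyse(node, labels, follow, ids):
    """Return (nullable, first, last) for node; fill labels and follow."""
    kind = node[0]
    if kind == "sym":
        pos = next(ids)              # each occurrence gets a fresh position
        labels[pos] = node[1]
        follow[pos] = set()
        return False, {pos}, {pos}
    if kind in ("star", "opt"):
        _, first, last = analyse(node[1], labels, follow, ids)
        if kind == "star":           # looping back: last positions reach first
            for p in last:
                follow[p] |= first
        return True, first, last
    parts = [analyse(child, labels, follow, ids) for child in node[1:]]
    if kind == "alt":
        return (any(n for n, _, _ in parts),
                set().union(*(f for _, f, _ in parts)),
                set().union(*(l for _, _, l in parts)))
    # kind == "seq": fold the children left to right
    nullable, first, last = True, set(), set()
    for n, f, l in parts:
        for p in last:               # end of the prefix can be followed by f
            follow[p] |= f
        if nullable:                 # prefix may be empty, so f can start
            first |= f
        last = (last | l) if n else set(l)
        nullable = nullable and n
    return nullable, first, last

def is_deterministic(model):
    """True if no choice point offers two positions with the same name."""
    labels, follow = {}, {}
    _, first, _ = analyse(model, labels, follow, count())
    for candidates in [first, *follow.values()]:
        names = [labels[p] for p in candidates]
        if len(names) != len(set(names)):
            return False
    return True

a, b = sym("a"), sym("b")
MODELS = {
    "w": star(star(a)),               # (a*)*
    "x": star(alt(plus(a), b)),       # (a+ | b)*
    "y": star(seq(a, star(a))),       # (a, a*)*
    "z": seq(plus(a), opt(b), plus(a)),  # (a+, b?, a+)
}

if __name__ == "__main__":
    for name, model in MODELS.items():
        verdict = "ok" if is_deterministic(model) else "ambiguous"
        print(name, verdict)
```

Under these assumptions the sketch reports w as ok and x, y, and z
as ambiguous, which agrees with the expectation stated for the toy
DTD.  If a+ were instead treated as a single primitive occurrence
rather than expanded, x would no longer be flagged, which is one
reason different processors can disagree on (a+ | b)*.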

-- 

Richard Goerwitz
PGP key fingerprint:    C1 3E F4 23 7C 33 51 8D  3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.):  finger richard@g...

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)

