[XML-DEV Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message]

Re: [Summary] Constrain the Number of Occurrences of Elements


number counters

To be clear here. I consider termination to be a desirable formal property of description in data models and programs. I also consider formal pragmatics an essential ingredient of computable systems. Hence, I do not "disapprove" of recursive types (perhaps this is consistent with Bob's data model request) but seek the expression of a pragmatic constraint on the depth of an instance.

Steven



On Aug 8, 2005, at 6:39 PM, Steven Ericsson Zenith wrote:

Bob Foster wrote:
That's exactly backwards. From a formal point of view, unbounded and unlimited depth recursion is highly desirable. If it were an issue, why doesn't the Internet set an upper bound on session length, why don't relational databases require you to specify a maximum number of rows per table, etc.?
The same applies in these cases - the behavior of relational database implementation and the Internet are not authoritative and certainly not ideal formally.
From a practical point of view, a session can't go on forever, a table can't have an infinite number of rows and you can't instantiate an infinite size document. But this has little to do with data modeling.
What is your point, exactly? The same as above applies here.

Thanks for your explanation of the parsing issue.  It isn't clear to me still - forgive me for being dense.  You are talking about validating the schema, or the instance?  Why are you doing these expansions?

With respect,
Steven


> I don't understand why a validator taking into account maxOccurs=N
> should be inefficient - someone needs to explain that to me.  How hard
> can that be?

Let's write <element name="a" minOccurs="1" maxOccurs="4"> using the shorthand "a{1,4}".

This is equivalent to "a a? a? a?" in standard regular expression notation. So an expression like "a{0,30000}" has 30,000 terms when unfolded! If you try to implement this expression with a state transition matrix, as some do, you will find yourself with 30,000^2 terms.

If you try to implement the expression as a deterministic finite state machine, it might seem less problematic. A minimal DFM for the expression is:

       Match     Success   Fail
Start    a          1       -
1        a          2      Next
2        a          3      Next
3        a         Next    Next

So for the expression "a{0,30000}" you 'only' have 30,000 states. But producing a DFM from an ambiguous expression is not trivial; algorithms can require exponential time, space or both.

If you try to use derivative parsing and your parser operates by unfolding the expression (as Jing does) you will not only have 30,000 terms in the original pattern but the expression can expand combinatorially as the parser matches to allow for the possibility it might have to "back up", e.g., in the expression "a{0,30000} b | a{0,29999} c".

Finally, you can implement this as a non-deterministic FSM with some constant times 30,000 states. But parsing with an NFM requires backtracking, which in general can require exponential time.

Special-case hackery can do quite a bit better, e.g., using explicit counters. A DFM with counters has only one state which can be re-entered 29,999 times. I described an analogous approach in my note on bounded constraint derivative parsing a couple of years ago. But these approaches aren't much used in practice today, primarily because of the lack of formal methods for deriving FSM with counters and/or lack of widespread knowledge.

Bob Foster
http://xmlbuddy.com/

> I can imagine allowing recursive types but that they be constrained in
> any instance of the type.  So I would suggest that the term "unbounded"
> be an specifiable constant in a 1.1 schema. I would like to see the
> depth constraints on recursion as a part of a broader constraints system
> in 1.1.
>
> Sorry that I have not been paying close attention - I have been busy.
>
> With respect,
> Steven
> --
> Dr. Steven Ericsson-Zenith



PURCHASE STYLUS STUDIO ONLINE TODAY!

Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced!

Buy Stylus Studio Now

Download The World's Best XML IDE!

Accelerate XML development with our award-winning XML IDE - Download a free trial today!

Don't miss another message! Subscribe to this list today.
Email
First Name
Last Name
Company
Subscribe in XML format
RSS 2.0
Atom 0.3
 

Stylus Studio has published XML-DEV in RSS and ATOM formats, enabling users to easily subcribe to the list from their preferred news reader application.


Stylus Studio Sponsored Links are added links designed to provide related and additional information to the visitors of this website. they were not included by the author in the initial post. To view the content without the Sponsor Links please click here.

Site Map | Privacy Policy | Terms of Use | Trademarks
Free Stylus Studio XML Training:
W3C Member
Stylus Studio® and DataDirect XQuery ™are products from DataDirect Technologies, is a registered trademark of Progress Software Corporation, in the U.S. and other countries. © 2004-2013 All Rights Reserved.