[XML-DEV Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message]

Re: Blowout

  • From: Joe English <jenglish@f...>
  • To: xml-dev@l...
  • Date: Mon, 05 Feb 2001 08:49:35 -0800

innovation blowout

Rick Jelliffe  wrote:

> But now it occurs to me that perhaps there are some basic questions we can
> ask (of a schema implementer or language designer) which may give a head
> start in the absense of benchmarking:
>
>  - Are there any innocent-looking structures that explode (or may explode)
> (perhaps this is the same as asking are there any constructs which, when
> used, may have more than an O(n) effect on performance) and what are the
> workarounds (e.g. in XML Schemas case, to detect unbounded particles in
> unbounded choice groups) ?

I don't think there are any explosive constructs in TREX.
In James' reference implementation of TREX (if I understand it
correctly),  it takes approximately O(N) time and space to preprocess
the schema, since this phase amounts to little more than constructing
an abstract syntax tree.   During validation, each start-tag, attribute,
end-tag, and #PCDATA particle can be processed in O(1) amortized time.
(The _first_ time a particular state/symbol pair is encountered, it
takes O(pattern-size) time to compute the next state, but this result is
cached so the next time around it takes constant time.  As the
size of the document approaches infinity, this averages out to O(1)
per input symbol.)

Space requirements are modest as well: storage requirements
include the stack of open elements from the input document,
the AST of the schema, and a lazily-constructed DFA transition
table.  The space requirements are largely independent of
the size of the input document, so TREX is suitable for
large input sets.

Worst-case space requirements involving INTERLEAVE and CONCUR are
also handled efficiently.  An INTERLEAVE pattern containing N element
tokens can lead to a DFA with 2^N states.  However, states are only
constructed as they are needed so the implementation will only use O(2^N)
space for such a pattern if the input document actually contains
sequences for each possible permutation.

I believe that RELAX can be implemented in the same way, though
I don't know if any of the existing implementations do so.

--Joe English

  jenglish@f...

  • Follow-Ups:
  • References:
    • Blowout
      • From: Rick Jelliffe <ricko@a...>

PURCHASE STYLUS STUDIO ONLINE TODAY!

Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced!

Buy Stylus Studio Now

Download The World's Best XML IDE!

Accelerate XML development with our award-winning XML IDE - Download a free trial today!

Don't miss another message! Subscribe to this list today.
Email
First Name
Last Name
Company
Subscribe in XML format
RSS 2.0
Atom 0.3
 

Stylus Studio has published XML-DEV in RSS and ATOM formats, enabling users to easily subcribe to the list from their preferred news reader application.


Stylus Studio Sponsored Links are added links designed to provide related and additional information to the visitors of this website. they were not included by the author in the initial post. To view the content without the Sponsor Links please click here.

Site Map | Privacy Policy | Terms of Use | Trademarks
Free Stylus Studio XML Training:
W3C Member
Stylus Studio® and DataDirect XQuery ™are products from DataDirect Technologies, is a registered trademark of Progress Software Corporation, in the U.S. and other countries. © 2004-2013 All Rights Reserved.