RE: Schematron Best Practice: A Schematron schema's area

From: "Rick Jelliffe" <rjelliffe@a...>
To: xml-dev@l...
Date: Wed, 18 Jul 2007 11:09:09 +1000 (EST)

> Mark Delaney asks:
>
>> Are there order-of-magnitude variations in efficiency, in either memory
>> use or time, between alternative languages? If so, are these variations
>> essential, or merely a quirk of the available implementations?

There are probably order-of-magnitude differences between alternative
implementations of the same language, let alone between languages!
(Certainly this is true with XSLT-based systems.)

The primary issue is whether a constraint can be tested in
 1) Streaming order with no state saved
 2) Streaming order with state (or values) saved (e.g. ID checking)
 3) random access
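
As a rough illustration of category 2), here is a minimal Python sketch of an ID-uniqueness check done in streaming order with saved state. It assumes the identifier lives in an attribute literally named "id", and the function name duplicate_ids is made up for this example. It is not how any particular validator does it.

```python
import io
import xml.etree.ElementTree as ET


def duplicate_ids(xml_text):
    """Return the id values that occur more than once, in the order the repeats are met."""
    seen = set()        # the saved state: every id value met so far
    duplicates = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            # Attributes are already available when the start tag is reported.
            value = elem.get("id")
            if value is not None:
                if value in seen:
                    duplicates.append(value)
                seen.add(value)
        else:
            # Drop the element's content once it is closed, so memory use is
            # driven by the set of ids rather than by the document size.
            elem.clear()
    return duplicates


print(duplicate_ids('<r><a id="x"/><b id="y"/><c id="x"/></r>'))
```

The only thing kept between events is the set of values seen so far, which is what separates this from a category 1) check that needs no memory at all.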

What grammar-based schema languages do is limit themselves to 1), then also
allow some convenient number of 2) where some abstraction can be used to
make it coherent why the grammar model has been sidestepped (ALL, ID,
etc).  What the default Schematron implementation does is start from 3)
using XSLT1, then allow implementations to figure out optimizations if
random access is not allowed.  For example, an implementation could split
up a schema so that the streamable constraints are tested first (e.g. as
the DOM is being built), then the random-access constraints are checked
when the DOM is ready.
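
A minimal Python sketch of that kind of split, using two invented rules on an invented vocabulary (item, ref, price and target are all hypothetical names): a check that needs only the current start tag runs while the tree is being built, and a check that needs the whole document runs once the tree is ready. This only shows the idea. It is not taken from any existing Schematron implementation.

```python
import io
import xml.etree.ElementTree as ET


def validate(xml_text):
    """Run streamable checks during parsing, then random-access checks on the tree."""
    errors = []
    root = None
    source = io.BytesIO(xml_text.encode("utf-8"))

    # Phase 1: streamable constraint, tested as the tree is being built.
    # Hypothetical rule: every <item> must carry a price attribute.
    for _event, elem in ET.iterparse(source, events=("start",)):
        if root is None:
            root = elem  # the first start event is the document element
        if elem.tag == "item" and elem.get("price") is None:
            errors.append(f"item {elem.get('id')!r} has no price")

    # Phase 2: random-access constraint, tested once the tree is complete.
    # Hypothetical rule: every <ref target="..."> must name an <item id="...">
    # anywhere in the document, including items that appear after the ref.
    ids = {item.get("id") for item in root.iter("item")}
    for ref in root.iter("ref"):
        if ref.get("target") not in ids:
            errors.append(f"ref to unknown item {ref.get('target')!r}")
    return errors


doc = '<order><ref target="b"/><item id="a" price="1"/><item id="b"/></order>'
print(validate(doc))
```

The phase 1 errors are known before parsing finishes, so a caller that only cares about those could stop early and never pay for phase 2.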

More than this, ISO DSDL looks like adopting the STX streaming XPath
language. When Schematron is used with this, then you certainly get a
streaming implementation that would not have object creation overhead.

The other aspect is that you tend to express different things in
different schema languages: the grammars force you to pay a lot of
attention to sequencing issues and are not at all good with partial orders.
Ticking through a big state machine is very easy, but when the state
transitions don't reflect business requirements they may be a burden and a
cost. Furthermore, the grammars actively discourage separation of
concerns: each stakeholder, agent and process in the pipeline may have
different, uncoordinated and independent constraints. Grammars, notably
XSD, have proved themselves to be unattractive for validation: people
choose not to validate because with XSD and grammars they have to
over-validate (validate things they are not interested in, and omit to
validate things they are interested in) without getting useable
diagnostics. So when considering efficiency, are systems that promote, in
effect, no validation actually more "efficient" than systems that promote
effective partial validation...

Cheers
Rick Jelliffe

Follow-Ups:
- RE: Schematron Best Practice: A Schematron schema's area of responsibility?
  - From: "Len Bullard" <cbullard@h...>

References:
- RE: Schematron Best Practice: A Schematron schema's area of responsibility?
  - From: "Mark Delaney" <MARKD@m...>
- RE: Schematron Best Practice: A Schematron schema's area of responsibility?
  - From: noah_mendelsohn@u...
