
RE: Schematron Best Practice: A Schematron schema's area ofre

  • From: noah_mendelsohn@u...
  • To: "Mark Delaney" <MARKD@m...>
  • Date: Tue, 17 Jul 2007 19:15:50 -0400

Mark Delaney asks:

> Are there order-of-magnitude variations in efficiency, in either memory
> use or time, between alternative languages? If so, are these variations
> essential, or merely a quirk of the available implementations?

I think that in practice that's likely to be more a question about 
particular implementations than about the languages in general.  Furthermore, 
the actual performance you see will be a function of the decisions made in 
the particular validator you choose and the particular schema features you 
use.  My intuition is that if you use relatively simple XPaths, and if you 
don't use features like identity constraints aggressively in XSD, then the 
theoretically achievable performance for a Schematron schema and an XSD 
schema would be on the same order.  Then again, I 
would expect that if you compare multiple implementations of either one of 
these languages, the performance could vary by as much as two orders of 
magnitude, depending on how much attention was paid to performance 
optimization.

As an example of the sorts of factors that can come into play if you want 
to optimize really aggressively, I can (self-servingly) point you to a 
paper my team published at WWW 2006 on a very high performance XSD 
implementation [1].  The message you should get is that with real care, 
XSD can run much faster than you usually see it run.  In fact, you can do 
validations with roughly zero overhead compared to many of the better 
nonvalidating parsers you've seen (e.g. Expat), at least in many cases. 
Then again, that implementation is very complex, depends on advance 
compilation, has never been productized for general use, and is not 
typical of what's likely available to you in practice today.  Still, it 
shows what can be done with the XSD language.

I have less experience with Schematron, but my intuition is that a really 
aggressive Schematron implementation could reach similar speeds, 
especially with XPaths that stream well, or that can be rewritten to 
stream well (Fabio Vitali's team in Bologna claims that's pretty much all 
XPaths).  Streaming and good validation performance often go together, 
since any non-streaming behavior tends to mean you're revisiting things 
you've seen before, and that's overhead.  See our paper [1] for more on that 
line of thinking.
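
To make the streaming point concrete, here is a minimal sketch in Python. 
It is not taken from any real Schematron or XSD implementation.  It assumes 
a hypothetical rule with context "item" and assertion "@price", and the 
names ItemPriceChecker and check_stream are invented for illustration.  
Because the assertion only looks at the attributes of the element being 
opened, it can be decided in a single forward pass over SAX events, without 
building a tree or revisiting anything.

```python
import io
import xml.sax


class ItemPriceChecker(xml.sax.ContentHandler):
    """Checks one Schematron-style rule in a single forward pass.

    Hypothetical rule (an assumption for this sketch):
    context "item", assert "@price".
    """

    def __init__(self):
        super().__init__()
        self.items_seen = 0
        self.failures = []  # 1-based positions of item elements that fail

    def startElement(self, name, attrs):
        # The assertion depends only on the current start tag, so it can
        # be evaluated immediately; nothing has to be kept for later.
        if name == "item":
            self.items_seen += 1
            if "price" not in attrs:
                self.failures.append(self.items_seen)


def check_stream(xml_bytes):
    """Parse the document once and return the positions of failing items."""
    handler = ItemPriceChecker()
    xml.sax.parse(io.BytesIO(xml_bytes), handler)
    return handler.failures


if __name__ == "__main__":
    doc = b'<order><item price="1"/><item/><item price="3"/><item/></order>'
    print(check_stream(doc))
```

An assertion that looks backward or sideways (for example, comparing an 
item against an earlier sibling) would need either buffered state or a 
rewrite of the XPath before it could be checked this way, which is where 
the overhead mentioned above tends to come from.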

I also have the intuition that Schematron has often been used in 
interactive scenarios, where performance tuning is likely a lower 
priority.  Certainly running a general purpose XSL transform to create 
reports is likely to be slow, unless you have a very, very aggressively 
tuned XSL processor.

So, I think it's likely to be the implementations rather than the 
languages that you want to look at, unless your use of Schematron turns 
out to require very general XSL processing, in which case it could be 
significantly slower.  That last claim is based completely on intuition, 
and Rick is welcome to suggest that I am wildly off base (I suppose he's 
welcome to do that on all the claims above.)

Noah

[1] http://www2006.org/programme/item.php?id=5011

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------





