RE: Schematron Best Practice: A Schematron schema's area ofre
Mark Delaney asks:

> Are there order-of-magnitude variations in efficiency, in either memory
> use or time, between alternative languages? If so, are these variations
> essential, or merely a quirk of the available implementations?

I think that in practice that's likely to be more of a question about particular implementations than about the languages in general. Furthermore, the actual performance you see will be a function of the decisions made in the particular validator you choose and the particular schema features you use.

My intuition is that if you use relatively simple XPaths, and if you don't use features like identity constraints aggressively in XSD, then the theoretically achievable performance for a Schematron schema and for an XSD schema would be of the same order. Then again, I would expect that if you compare multiple implementations of either one of these languages, the performance could vary literally by, say, 2 orders of magnitude according to how much attention was paid to performance optimization.

As an example of the sorts of factors that can come into play if you want to optimize really aggressively, I can (self-servingly) point you to a paper my team published at WWW2006 on a very high performance XSD implementation [1]. The message you should get is that with real care, XSD can run much faster than you usually see it run. In fact, you can do validations with roughly zero overhead compared to many of the better nonvalidating parsers you've seen (e.g. Expat), at least in many cases. Then again, that implementation is very complex, depends on advance compilation, has never been productized for general use, and is not typical of what's likely available to you in practice today. Still, it shows what can be done with the XSD language.
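To make the point about relatively simple XPaths concrete, here is a minimal sketch of how a Schematron-style assertion with a simple context (roughly `order/item` with a test on `@qty`) might be checked in one forward pass, without building the whole tree. It is purely illustrative: the element names, the rule, and the function `check_items_streaming` are assumptions invented for this example, and it is not how any particular Schematron or XSD validator works, nor the technique described in [1].

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
SAMPLE = b'<order><item id="a" qty="2"/><item id="b"/><item id="c" qty="0"/></order>'

def check_items_streaming(xml_bytes):
    """Return the ids of order/item elements whose qty is not a positive integer.

    The document is read once, front to back; no element is revisited.
    """
    failures = []
    path = []  # names of the currently open elements, outermost first
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end"))
    for event, elem in events:
        if event == "start":
            path.append(elem.tag)
            # Attributes are already available when the start tag is seen,
            # so the rule can be decided without waiting for the content.
            if path[-2:] == ["order", "item"]:
                qty = elem.get("qty", "")
                if not (qty.isdecimal() and int(qty) > 0):
                    failures.append(elem.get("id"))
        else:
            path.pop()
            # Discard the finished element's content to keep memory use small.
            elem.clear()
    return failures

print(check_items_streaming(SAMPLE))
```

A rule that needed to look backwards (say, comparing each item against every earlier item) would not fit this shape as written, and would need either extra state or a second look at data already seen.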
I have less experience with Schematron, but my intuition is that a really aggressive Schematron implementation could reach similar speeds, especially with XPaths that stream well, or that can be rewritten to stream well (Fabio Vitali's team in Bologna claims that's pretty much all XPaths). Streaming and good validation performance often go together, since any non-streaming behavior tends to mean you're revisiting things you've seen before, and that's overhead. See our paper for more on that line of thinking.

I also have the intuition that Schematron has often been used in interactive scenarios, where performance tuning is likely a lower priority. Certainly running a general purpose XSL transform to create reports is likely to be slow, unless you have a very, very aggressively tuned XSL processor.

So, I think it's likely to be the implementations rather than the languages that you want to look at, unless your use of Schematron turns out to require very general XSL processing, in which case it could be significantly slower. That last claim is based completely on intuition, and Rick is welcome to suggest that I am wildly off base (I suppose he's welcome to do that on all the claims above).

Noah

[1] http://www2006.org/programme/item.php?id=5011

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------