Tool development: by Perl-wrapped XQuery
Hans-Juergen Rennau hrennau at yahoo.de
Fri Sep 4 10:59:19 PDT 2009
Hello David,
Scripting and tool development using script-wrapping are akin, but not quite the same. It seems important to me to distinguish clearly between two use cases:
1. the tool's functionality is provided by executing XQuery once (or only a few times)
2. the tool's functionality is provided by executing XQuery very often (as in your tests, hundreds of times)
Almost all tools I have developed so far fall into category 1. Obviously there is NO PERFORMANCE PENALTY worth mentioning in this case: I believe the difference between calling the query via its Perl script and calling it directly from the command line would not be perceptible to a human being.
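The difference between the two categories can be made concrete with a rough cost model. This is only a sketch under a stated assumption: each processor invocation pays a fixed startup cost on top of the actual per-document work. The function names and the numbers in the usage note are hypothetical placeholders, not measurements.

```python
def per_document_calls(n_docs, startup, work):
    """Category 2: the processor is launched once per document."""
    return n_docs * (startup + work)


def single_call(n_docs, startup, work):
    """Category 1: one launch, all documents handled inside the query."""
    return startup + n_docs * work


def penalty(n_docs, startup, work):
    """Ratio of category 2 cost to category 1 cost under this model."""
    return per_document_calls(n_docs, startup, work) / single_call(n_docs, startup, work)


if __name__ == "__main__":
    # Hypothetical values: startup and work are in arbitrary time units.
    for n in (1, 10, 1000):
        print(n, round(penalty(n, startup=1.0, work=0.01), 1))
```

With a single document the ratio is 1, which matches the claim that category 1 carries no noticeable penalty. As the number of documents grows, the ratio approaches (startup + work) / work, so a large penalty appears only when startup dominates the per-document work and the query is launched many times.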
It is no coincidence that category 1 applies VERY often: it is exactly XQuery's sequence philosophy which makes it the common case! Example: if you have a query that processes a document received as context item, you can modify it to process any number of input documents in succession, producing an aggregation of the results, by adding just 4 (!) lines:
declare variable $fileNames external;
declare variable $docs := for $f in tokenize($fileNames, "\s+") return doc($f);
<collection>{for $d in $docs return
- now comes the original query with this modification: prepend $d before any path starting with / or // -
}</collection>
All that is left to be done is to pass the names of the input documents to the query as a space-separated string, which is a trivial task for the little Perl section preceding the invocation.
If one ignores this dependence on the use case, one arrives at the situation described by Michael Kay on http://www.saxonica.com/questions.html, last question: "Is it true that Altova XML is three times as fast as Saxon?"
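A minimal sketch of that wrapper step, written here in Python rather than Perl purely for illustration. The helper names, the processor command and the `name=value` way of passing the external variable are assumptions made for this example; the actual argument syntax depends on the XQuery processor in use.

```python
def file_names_param(paths):
    """Join the input document names into one space-separated string."""
    for p in paths:
        # The query splits on whitespace, so such names cannot be passed this way.
        if not p or any(ch.isspace() for ch in p):
            raise ValueError(f"unsuitable file name: {p!r}")
    return " ".join(paths)


def build_invocation(processor_cmd, query_file, paths):
    """Build the argument list for ONE processor call covering all documents.

    Hypothetical convention: the external variable $fileNames is supplied
    as a trailing fileNames=<value> argument.
    """
    return list(processor_cmd) + [query_file, "fileNames=" + file_names_param(paths)]


if __name__ == "__main__":
    cmd = build_invocation(["my-xquery-processor"], "report.xq",
                           ["a.xml", "b.xml", "c.xml"])
    print(cmd)  # would be handed to subprocess.run(cmd) in a real wrapper
```

The point of the design is that the wrapper launches the processor once, however many documents are named, so the tool stays in category 1.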
I never cease to be amazed by exactly this quality of XQuery - its ability to integrate vast sets of input resources with almost no increased effort as compared to the processing of a single resource.
With kind regards,
Hans-Juergen Rennau
----- Original Message ----
From: David A. Lee <http://x-query.com/mailman/listinfo/talk>
To: Hans-Juergen Rennau <http://x-query.com/mailman/listinfo/talk>
CC: http://x-query.com/mailman/listinfo/talk; http://x-query.com/mailman/listinfo/talk
Sent: Friday, 4 September 2009, 04:09:21
Subject: Re: Tool development: by Perl-wrapped XQuery
In my opinion the problem with doing this in Perl is that unless the XQuery implementation itself is in Perl, or at least runs within the same process, you will run into horrible performance problems. See my paper (with Norm Walsh):
http://www.balisage.net/Proceedings/vol4/author-pkg/Lee01/BalisageVol4-Lee01.html
What we found is that for our test cases there is a 100-200x (yes, 10000%-20000%) performance penalty for using a scripting language to call XML processing programs. This *can* be optimized, but the people who use an off-the-shelf scripting language for this kind of thing are typically those who do not want to spend the extra effort to optimize it, or who are not experts in the type of software development and languages required to do it, or both. That is exactly why they are using scripting: so they don't have to do all that extra work.
This is the primary reason xmlsh was invented instead of re-using an existing scripting language. I wrote a "toy" program in a scripting language and it worked great.
But when I loaded up all the files I needed it to run on, it died a horrible death. This is what I call "The Brick Wall", and it is why scripting XML processes fails so many of us. The presentation cited above has some good charts and figures as well as the full test case code.
This is why I suggest either (or both) of the following:
* Use a scripting language that already is 'in process' with all the XML core languages you want to use (xquery, xslt etc)
-> examples XProc, xmlsh
* Encourage scripting language developers to embed these XML languages directly into the scripting languages (say Perl).
-> This is hard work and may in fact involve re-implementing many of the core tools from scratch.
-> Some of the work is done but is incomplete ... I've seen references to XSLT implementations native in Perl where the author said something like
"This isnt a complete implementation of XSLT 1.0 but it works pretty good for me".
-- David A. Lee






