Tool development: by Perl-wrapped XQuery
David A. Lee dlee at calldei.com
Fri Sep 4 14:10:37 PDT 2009

I agree that if you stick to use case #1 (only call xquery once or few
times) and all your input and output are files (or file names) then
pretty much any scripting language that lets you launch a subprocess is
equivalent and you will have very little performance hit.
Historically, I would suggest that many people stick to that use case
primarily because it's horrendously expensive not to (in most languages),
so they have instead built 'monolithic' programs that do all the work
within one program (xquery, xslt etc.) and take a list of input filenames,
and maybe output filenames, to process.
This does work.
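
As a rough sketch of that single-launch pattern: a thin wrapper joins the
input file names into one space-separated string and starts the XQuery
processor exactly once. The executable name `xquery-processor` and the
`name=value` parameter convention are assumptions for illustration, not
the command line of any particular product; `fileNames` is the external
variable used in the quoted message below.

```python
import subprocess


def build_command(processor_argv, query_file, file_names):
    """Build one argv list that hands every input file to a single query run."""
    # Assumption: the processor accepts external variables as name=value.
    joined = " ".join(file_names)
    return list(processor_argv) + [query_file, f"fileNames={joined}"]


def run_once(processor_argv, query_file, file_names):
    """Launch the (hypothetical) processor once and return its stdout."""
    cmd = build_command(processor_argv, query_file, file_names)
    done = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return done.stdout


if __name__ == "__main__":
    # Only prints the command; nothing is launched here.
    print(build_command(["xquery-processor"], "report.xq", ["a.xml", "b.xml"]))
```

File names that themselves contain spaces would break this scheme, since
the query splits the string on whitespace.
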
My point is that this programming model is no longer necessary. It was
arrived at because of the poor performance of doing it any other way,
not because it's a desirable way of coding.
If you use a language that can call xquery *efficiently* you can design
in a more modular way and use 'itty bitty' xquery programs run thousands
of times instead of one huge program run once, just as efficiently, and
in my opinion much easier to develop, debug, and understand. It frees
you to design and develop in modular ways, using languages of your
choice for different operations instead of forcing you into a single
monolithic application.
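
The gap between the two approaches can be felt with a small timing
sketch. It does not run XQuery at all: a trivial calculation stands in
for an 'itty bitty' query, executed once inside the current process and
once by launching a fresh interpreter per call. The function names are
made up for illustration, and the numbers it prints depend entirely on
the machine, so treat it as a way to see the shape of the overhead
rather than as a benchmark.

```python
import subprocess
import sys
import time


def run_in_process(n):
    """Run a tiny task n times inside the current process."""
    def tiny_task(i):
        return i * 2

    start = time.perf_counter()
    results = [tiny_task(i) for i in range(n)]
    return time.perf_counter() - start, results


def run_as_subprocesses(n):
    """Run the same tiny task n times, launching a new process each time."""
    start = time.perf_counter()
    results = []
    for i in range(n):
        done = subprocess.run(
            [sys.executable, "-c", f"print({i} * 2)"],
            capture_output=True, text=True, check=True,
        )
        results.append(int(done.stdout))
    return time.perf_counter() - start, results


if __name__ == "__main__":
    n = 20
    t_in, r_in = run_in_process(n)
    t_sub, r_sub = run_as_subprocesses(n)
    assert r_in == r_sub  # same answers, very different cost
    print(f"in-process: {t_in:.6f}s   one process per call: {t_sub:.6f}s")
```
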
Of course it is a personal *opinion* that that is a 'better' way of
doing things :) I certainly enjoy it more.
Others may prefer writing one huge program that does everything.
This exact same philosophical (and technical) debate is what
distinguished unix from the mainframe model of programming. I find it
fascinating that the same fundamental design issues and discussions
evolved again 40 years later.
David A. Lee
http://x-query.com/mailman/listinfo/talk
http://www.calldei.com
http://www.xmlsh.org
812-482-5224
Hans-Juergen Rennau wrote:
> Hello David,
>
> scripting and tool development using script-wrapping are akin, but not quite the same. It seems to me important to distinguish clearly between two use cases:
>
> 1. the tool's functionality is provided by executing XQuery once (or only a few times)
> 2. the tool's functionality is provided by executing XQuery very often (as in your tests, hundreds of times)
>
> Almost all tools I have developed so far fall into category 1. Obviously there is NO PERFORMANCE PENALTY worth mentioning in this case: I believe the difference between calling the query via its Perl script or alternatively calling it directly from the command line would not be perceptible for a human being.
>
> It is no coincidence that category 1 can VERY often be used; rather, it is exactly the sequence philosophy which makes it a common case! Example: if you have a query processing a document received as context item, then you can modify the query to process any number of input documents in succession, producing an aggregation of results, by just adding 4 (!) lines:
>
> declare variable $fileNames external;
> declare variable $docs := for $f in tokenize($fileNames, "\s+") return doc($f);
> <collection>{for $d in $docs return
> - now comes the original query with this modification: prepend $d before any path starting with / or // -
> }</collection>
>
> All that is left to be done is to pass the names of the input documents as a space-separated string to the query, which is a trivial task for the little Perl section preceding the invocation.
>
> If one ignores this dependence on use case, one arrives at the situation described by Michael Kay on http://www.saxonica.com/questions.html, last question: "Is it true that Altova XML is three times as fast as Saxon?"
>
> I never cease to be amazed by exactly this quality of XQuery - its ability to integrate vast sets of input resources with almost no increased effort as compared to the processing of a single resource.
>
> With kind regards,
> Hans-Juergen Rennau
>
>
>
>
> ----- Original Message ----
> From: David A. Lee <http://x-query.com/mailman/listinfo/talk>
> To: Hans-Juergen Rennau <http://x-query.com/mailman/listinfo/talk>
> CC: http://x-query.com/mailman/listinfo/talk; http://x-query.com/mailman/listinfo/talk
> Sent: Friday, 4 September 2009, 04:09:21
> Subject: Re: Tool development: by Perl-wrapped XQuery
>
>
> In my opinion the problem with doing this in Perl is that unless the xquery implementation itself is in Perl, or at least runs within the same process, you will run into horrible performance problems. See my paper (with Norm Walsh):
>
> http://www.balisage.net/Proceedings/vol4/author-pkg/Lee01/BalisageVol4-Lee01.html
>
> What we found is that for our test cases there is a 100-200x (yes, 10000%-20000%) performance penalty for using a scripting language to call XML processing programs. This *can* be optimized, but the people who reach for an off-the-shelf scripting language to do this kind of thing are typically those who do not want to spend the extra effort to optimize it, or who are not experts in the type of software development/languages required to do it, or both. I.e., it's exactly why they are using scripting: so they don't have to do all that extra work.
>
> This is the primary reason xmlsh was invented instead of re-using an existing scripting language. I took a "toy" program in a scripting language and it worked great.
> But when I loaded up all the files I needed it to run on, it died a horrible death. This is what I call "The Brick Wall", and it is why scripting XML processes fails so many of us. The presentation cited above has some good charts and figures as well as the full test case code.
>
> This is why I suggest either or both of the following:
>
> * Use a scripting language that already is 'in process' with all the XML core languages you want to use (xquery, xslt etc)
> -> examples XProc, xmlsh
>
> * Encourage scripting languages developers to embed these XML languages directly into the scripting languages (say perl).
> -> This is hard work and may in fact involve re-implementing many of the core tools from scratch.
> -> Some of the work is done but is incomplete ... I've seen references to XSLT implementations native in Perl where the author quoted something like
> "This isnt a complete implementation of XSLT 1.0 but it works pretty good for me".
>
>
> -- David A. Lee
>
>
>
>
>
> _______________________________________________
> http://x-query.com/mailman/listinfo/talk
>