
Tool development: by Perl-wrapped XQuery

David A. Lee dlee at calldei.com
Fri Sep 4 14:10:37 PDT 2009


I agree that if you stick to use case #1 (only call xquery once or a few
times) and all your input and output are files (or file names), then
pretty much any scripting language that lets you launch a subprocess is
equivalent and you will take very little performance hit.
Historically, I suggest many people have used that use case primarily
because it's horrendously expensive not to (in most languages), so they
have instead built 'monolithic' programs that do all the work within one
program (xquery, xslt etc.) and take a list of input filenames, and maybe
output filenames, to process.
This does work.
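
A minimal sketch of that "launch once, pass every file name" style of wrapper, written in Python rather than Perl purely for illustration. The helper name `build_query_command`, the processor command `my-xquery`, and the `name=value` way of setting an external variable are all assumptions; check how your own XQuery processor expects external variables to be passed.

```python
from typing import List, Sequence


def build_query_command(processor_cmd: Sequence[str],
                        query_file: str,
                        file_names: Sequence[str]) -> List[str]:
    """Build one argv that hands every input file to a single XQuery run."""
    for name in file_names:
        # The query splits the variable on whitespace, so a name that
        # contains whitespace cannot be passed this way.
        if any(ch.isspace() for ch in name):
            raise ValueError(f"file name contains whitespace: {name!r}")
    # Assumption: the processor accepts external variables as name=value.
    return [*processor_cmd, query_file, "fileNames=" + " ".join(file_names)]
```

The resulting list could then be handed to `subprocess.run(..., check=True)` once for the whole batch, so the process start-up cost is paid a single time no matter how many files are processed.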

My point is that this programming model is no longer necessary. It was
arrived at because of the poor performance of doing it any other way,
not because it's a desirable way of coding.
If you use a language that can call xquery *efficiently*, you can design
in a more modular way and use 'itty bitty' xquery programs run thousands
of times instead of one huge program run once, just as efficiently and,
in my opinion, with code that is much easier to develop, debug, and
understand.  It frees you to use languages of your choice for different
operations instead of forcing you into a single monolithic application.
Of course it is a personal *opinion* that that is a 'better' way of 
doing things :)  I certainly enjoy it more.
Others may prefer writing one huge program that does everything.  
This exact same philosophical (and technical) debate is what
distinguished Unix from the mainframe model of programming.  I find it
fascinating that the same fundamental design issues and discussions
have come up again 40 years later.
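
To get a rough feel for why calling *efficiently* matters when a small program is run thousands of times, here is a hedged sketch in Python that times the same trivial task run in-process and run in a freshly launched process. It measures Python interpreter start-up, not an XQuery processor, so it does not reproduce the figures from the paper; the function names are made up for this illustration.

```python
import subprocess
import sys
import time
from typing import Callable


def seconds_per_run(action: Callable[[], object], runs: int) -> float:
    """Average wall-clock seconds for one call of `action`."""
    start = time.perf_counter()
    for _ in range(runs):
        action()
    return (time.perf_counter() - start) / runs


def tiny_task() -> int:
    # Stand-in for an 'itty bitty' query evaluated inside the current process.
    return sum(range(100))


def tiny_task_in_new_process() -> None:
    # Same work, but paying for a fresh interpreter start-up on every call.
    subprocess.run([sys.executable, "-c", "sum(range(100))"], check=True)


if __name__ == "__main__":
    in_process = seconds_per_run(tiny_task, 1000)
    per_launch = seconds_per_run(tiny_task_in_new_process, 20)
    print(f"in-process: {in_process:.6f}s  new process: {per_launch:.6f}s")
```

The absolute numbers depend on the machine; the point is only that the per-launch cost is paid on every call, which is what pushes people toward one big program when each call means a new process.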



David A. Lee
http://x-query.com/mailman/listinfo/talk  
http://www.calldei.com
http://www.xmlsh.org
812-482-5224



Hans-Juergen Rennau wrote:
> Hello David,
>
> scripting and tool development using script-wrapping are akin, but not quite the same. It seems important to me to distinguish clearly between two use cases:
>
> 1. the tool's functionality is provided by executing XQuery once (or only a few times)
> 2. the tool's functionality is provided by executing XQuery very often (as in your tests, hundreds of times)
>
> Almost all tools I have developed so far fall into category 1. Obviously there is NO PERFORMANCE PENALTY worth mentioning in this case: I believe the difference between calling the query via its Perl script or alternatively calling it directly from the command line would not be perceptible for a human being.
>
> It is no coincidence that category 1 can VERY often be used - rather, it is exactly the sequence philosophy which makes it a common case!  Example: if you have a query processing a document received as context item, then you can modify the query to process any number of input documents in succession, producing an aggregation of results, by just adding 4 (!) lines:
>
> declare variable $fileNames external;
> declare variable $docs := for $f in tokenize($fileNames, "\s+") return doc($f);
> <collection>{for $d in $docs return
>    - now comes the original query with this modification: prepend $d before any path starting with / or // -
> }</collection>
>
> All that is left to be done is pass the names of the input documents as a space separated string to the query. Which is a trivial task for the little Perl section preceding the invocation.
>
> If one ignores this dependence on the use case, one arrives at the situation described by Michael Kay on http://www.saxonica.com/questions.html, last question: "Is it true that Altova XML is three times as fast as Saxon?"
>
> I never cease to be amazed by exactly this quality of XQuery - its ability to integrate vast sets of input resources with almost no increased effort as compared to the processing of a single resource.
>
> With kind regards,
> Hans-Juergen Rennau
>
>
>
>
> ----- Original Message ----
> From: David A. Lee <http://x-query.com/mailman/listinfo/talk>
> To: Hans-Juergen Rennau <http://x-query.com/mailman/listinfo/talk>
> CC: http://x-query.com/mailman/listinfo/talk; http://x-query.com/mailman/listinfo/talk
> Sent: Friday, 4 September 2009, 04:09:21
> Subject: Re:  Tool development: by Perl-wrapped XQuery
>
>
> In my opinion the problem with doing this in Perl is that unless the xquery implementation itself is in Perl, or at least runs within the same process, you will run into horrible performance problems.   See my paper (with Norm Walsh):
>
> http://www.balisage.net/Proceedings/vol4/author-pkg/Lee01/BalisageVol4-Lee01.html
>
> What we found is that for our test cases there is a 100-200x (yes, 10,000%-20,000%) performance penalty for using a scripting language to call xml processing programs.   This *can* be optimized, but the people who use an off-the-shelf scripting language for this kind of thing are typically those who do not want to spend the extra effort to optimize it, or who are not experts in the type of software development/languages required to do it, or both. That is exactly why they are using scripting: so they don't have to do all that extra work.
>
> This is the primary reason xmlsh was invented instead of re-using an existing scripting language.   I wrote a "toy" program in a scripting language and it worked great.
> But when I loaded up all the files I needed it to run on, it died a horrible death.   This is what I call "The Brick Wall", and it is why scripting XML processes fails so many of us.   The presentation cited above has some good charts and figures as well as the full test case code.
>
> This is why I suggest either (or both):
>
> * Use a scripting language that is already 'in process' with all the XML core languages you want to use (xquery, xslt etc.)
> -> examples: XProc, xmlsh
>
> * Encourage scripting language developers to embed these XML languages directly into the scripting languages (say Perl).
> -> This is hard work and may in fact involve re-implementing many of the core tools from scratch.
> -> Some of the work is done but is incomplete ... I've seen references to XSLT implementations native in Perl where the author said something like
>   "This isn't a complete implementation of XSLT 1.0 but it works pretty good for me".
>
>
> -- David A. Lee

