Subject: Re: Running the same transformation on many input files, optimisation possible?
From: "Dimitre Novatchev dnovatchev@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sun, 15 Dec 2019 23:00:42 -0000
Thank you, Dr. Kay,

> Indeed, I overlooked the possibility of having the controlling loop
> written in XSLT 3.0 and executing the per-transformation code using
> fn:transform(), which gets over the problems of changing existing XSLT
> code if it relies on global variables.

Does Saxon implement the semantics of

    "cache" : true()

as per the Spec?

Thanks,
Dimitre


On Sun, Dec 15, 2019 at 2:50 PM Michael Kay mike@xxxxxxxxxxxx <
xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

> Indeed, I overlooked the possibility of having the controlling loop
> written in XSLT 3.0 and executing the per-transformation code using
> fn:transform(), which gets over the problems of changing existing XSLT code
> if it relies on global variables.
>
> Michael Kay
> Saxonica
>
> On 15 Dec 2019, at 22:35, Dimitre Novatchev dnovatchev@xxxxxxxxx <
> xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> > Note that there's a double overhead here: firstly you're bringing up a
> > new Java VM for each transformation, and secondly you're recompiling
> > the stylesheet for each transformation.
>
> Isn't the meaning of the
>
>     "cache" : true()
>
> key-value pair in the $options argument of `fn:transform()` exactly to
> compile the stylesheet only once and to reuse the compiled result whenever
> the same stylesheet node or stylesheet text is passed again?
>
> From the Spec (https://www.w3.org/TR/xpath-functions-31/#func-transform):
>
> cache (1.0, 2.0, 3.0): This option has no effect on the result of the
> transformation but may affect efficiency. The value true indicates an
> expectation that the same stylesheet is likely to be used for more than one
> transformation; the value false indicates an expectation that the
> stylesheet will be used once only.
>
>    - Type: xs:boolean
>    - Default: true()
>
>
>
> Thanks,
> Dimitre
>
>
> On Sun, Dec 15, 2019 at 2:12 PM Michael Kay mike@xxxxxxxxxxxx <
> xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
>> Note that there's a double overhead here: firstly you're bringing up a
>> new Java VM for each transformation, and secondly you're recompiling the
>> stylesheet for each transformation.
>>
>> You can avoid the Java loading overhead by using ant or XProc, but I'm
>> not sure either of them will avoid the overhead of recompiling the
>> stylesheet; though if you use a recent Saxon version, you could achieve
>> that by reloading the stylesheet from a pre-compiled SEF (stylesheet export
>> file).
>>
>> You could write your own Java application to control the process,
>> invoking Saxon via the JAXP or s9api APIs - both allow you to compile a
>> stylesheet once and execute it repeatedly.
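
A minimal sketch of the compile-once, run-many pattern described above, using only the standard JAXP interfaces. The class name `BatchTransform`, the helper methods `compile` and `transform`, the inline stylesheet, and the in-memory inputs are all hypothetical and exist only to keep the example self-contained; the `doc` and `file` parameter names are taken from the script in the original question. `TransformerFactory.newInstance()` returns whichever JAXP processor it finds (the JDK's built-in XSLT 1.0 processor if no other is on the classpath), so the stylesheet here is deliberately XSLT 1.0.

```java
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import java.io.StringReader;
import java.io.StringWriter;

public class BatchTransform {

    // Hypothetical stand-in for transform.xsl: echoes both parameters and the input text.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:param name=\"doc\"/>"
      + "<xsl:param name=\"file\"/>"
      + "<xsl:template match=\"/\">"
      + "<xsl:value-of select=\"concat($doc, '/', $file, ': ', /item)\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Compile the stylesheet once. A Templates object can be reused for many runs.
    static Templates compile(String stylesheetText) throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        return factory.newTemplates(new StreamSource(new StringReader(stylesheetText)));
    }

    // Run one transformation. A fresh Transformer per input means parameters
    // and global variables are evaluated afresh for each source document.
    static String transform(Templates compiled, String inputXml, String doc, String file)
            throws TransformerException {
        Transformer transformer = compiled.newTransformer();
        transformer.setParameter("doc", doc);
        transformer.setParameter("file", file);
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(inputXml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        Templates compiled = compile(STYLESHEET); // compiled exactly once

        // Each job is {doc, file, input XML}; real code would read files instead.
        String[][] jobs = {
            {"d1", "f1", "<item>alpha</item>"},
            {"d2", "f2", "<item>beta</item>"}
        };
        for (String[] job : jobs) {
            System.out.println(transform(compiled, job[2], job[0], job[1]));
        }
    }
}
```

In a real build the string sources would presumably be replaced by `StreamSource` and `StreamResult` objects wrapping `java.io.File` instances, with the two loops over `dlist` and `flist` moved into `main`; both the JVM start-up and the stylesheet compilation are then paid once per build rather than once per file.
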
>>
>> You might be able to write the control loop in XSLT, for example by using
>> the collection() function, or functions in the EXPath file module. However,
>> this could require stylesheet changes if your XSLT code binds global
>> variables to values derived from the source document.
>>
>> In very simple cases you can take advantage of the fact that the -s
>> option for the Saxon command line can be a directory, in which case all the
>> input files are transformed to corresponding files in the -o directory.
>>
>> Michael Kay
>> Saxonica
>>
>> On 15 Dec 2019, at 09:03, Trevor Nicholls trevor@xxxxxxxxxxxxxxxxxx <
>> xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> Hi
>>
>> An application I am working on contains a large number of source
>> documents which are all run through the same series of transformations.
>> While initially the build process didn't take long, the cost of repeatedly
>> initialising the XSL processor soon adds up, so I am looking at ways to
>> streamline it.
>>
>> Our processor of choice is Saxon (currently we are using 8.7.3) so I can
>> shift this question to the Saxon list if there are extensions there that
>> are relevant.
>>
>> So the question: given a script that essentially includes the following:
>>
>> cd documents
>> for d in `cat dlist`; do
>>   cd $d
>>   for f in `cat flist`; do
>>     java -jar $SAXONDIR/saxon8.jar -o $f.new.xml $f.xml \
>>       $SCRIPTDIR/transform.xsl doc=$d file=$f
>>   done
>> done
>>
>> is there a mechanism which would allow a single Java process to perform
>> the equivalent?
>>
>> Thanks
>> T
>>
>> XSL-List info and archive <http://www.mulberrytech.com/xsl/xsl-list>
>> EasyUnsubscribe <http://lists.mulberrytech.com/unsub/xsl-list/293509> (by
>> email)
>>
>
>
> --
> Cheers,
> Dimitre Novatchev
> ---------------------------------------
> Truly great madness cannot be achieved without significant intelligence.
> ---------------------------------------
> To invent, you need a good imagination and a pile of junk
> -------------------------------------
> Never fight an inanimate object
> -------------------------------------
> To avoid situations in which you might make mistakes may be the
> biggest mistake of all
> ------------------------------------
> Quality means doing it right when no one is looking.
> -------------------------------------
> You've achieved success in your field when you don't know whether what
> you're doing is work or play
> -------------------------------------
> To achieve the impossible dream, try going to sleep.
> -------------------------------------
> Facts do not cease to exist because they are ignored.
> -------------------------------------
> Typing monkeys will write all Shakespeare's works in 200yrs.Will they
> write all patents, too? :)
> -------------------------------------
> Sanity is madness put to good use.
> -------------------------------------
> I finally figured out the only reason to be alive is to enjoy it.
>


-- 
Cheers,
Dimitre Novatchev
