
Re: Running the same transformation on many input file

Subject: Re: Running the same transformation on many input files, optimisation possible?
From: "Dimitre Novatchev dnovatchev@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sun, 15 Dec 2019 19:07:28 -0000
Looking at my answer, I think it would be useful to have another
overload of the standard `collection()` function. For reasons we all
know, this would have to be a Saxon extension function:

saxon:collection($arg as xs:string?, $processWith as function(*),
$multiThread as xs:boolean) as item()*
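
To make the idea concrete, here is a minimal sketch in plain Java of what such a call would amount to: take a set of files, apply the same processing to each one, and optionally do so in parallel. It uses only the standard JAXP API (`javax.xml.transform`), so it should work with whichever JAXP-compliant XSLT processor is on the classpath. The names `BatchTransform`, `listInputs`, `outputFor` and `transformAll` are hypothetical, the `.new.xml` output naming and the `file` parameter merely mirror the shell script quoted below, and none of this is Saxon API. The stylesheet is compiled once into a `Templates` object (which JAXP requires to be thread-safe), and each file gets its own `Transformer`.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class BatchTransform {

    /** File name without a trailing ".xml" (assumed naming convention). */
    static String baseName(Path input) {
        String name = input.getFileName().toString();
        return name.endsWith(".xml") ? name.substring(0, name.length() - 4) : name;
    }

    /** foo.xml -> foo.new.xml, as in the shell script. */
    static Path outputFor(Path input) {
        return input.resolveSibling(baseName(input) + ".new.xml");
    }

    /** Files in dir matching the glob, skipping outputs of earlier runs. */
    static List<Path> listInputs(Path dir, String glob) throws IOException {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> ds = Files.newDirectoryStream(dir, glob)) {
            for (Path p : ds) {
                if (!p.getFileName().toString().endsWith(".new.xml")) {
                    files.add(p);
                }
            }
        }
        Collections.sort(files);
        return files;
    }

    /** Compiles the stylesheet once, then transforms every input file. */
    static void transformAll(Path stylesheet, List<Path> inputs, boolean multiThread)
            throws TransformerException {
        Templates templates = TransformerFactory.newInstance()
                .newTemplates(new StreamSource(stylesheet.toFile()));
        Stream<Path> stream = multiThread ? inputs.parallelStream() : inputs.stream();
        stream.forEach(in -> {
            // A Transformer is not thread-safe, so create one per file.
            try (OutputStream out = Files.newOutputStream(outputFor(in))) {
                Transformer t = templates.newTransformer();
                t.setParameter("file", baseName(in));
                t.transform(new StreamSource(in.toFile()), new StreamResult(out));
            } catch (IOException | TransformerException e) {
                throw new RuntimeException("Failed on " + in, e);
            }
        });
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: java BatchTransform <stylesheet.xsl> <dir> [parallel]");
            return;
        }
        transformAll(Paths.get(args[0]), listInputs(Paths.get(args[1]), "*.xml"),
                args.length > 2);
    }
}
```

With `multiThread` set to true the work runs on Java's common fork-join pool, whose size follows the number of available cores by default. Whether this actually beats the sequential loop depends on document sizes and I/O, so it is worth measuring both.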



Cheers,
Dimitre

On Sun, Dec 15, 2019 at 9:49 AM Dimitre Novatchev <dnovatchev@xxxxxxxxx>
wrote:

> I would definitely use the `collection()` function, then would try to
> process the documents in parallel using the `saxon:threads` extension
> attributes with a value dependent on the number of cores on the machine.
>
>
> https://www.saxonica.com/html/documentation/extensions/attributes/threads.html
>
> Trying to generalize this a little further: if we have N machines, we
> could send N HTTP requests (why not using the `document()` function?), giving
> each machine a non-overlapping pattern for the set of files it should
> process.
>
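
One simple way to obtain non-overlapping sets, sketched here as an assumption rather than anything proposed in the thread: sort the file names and deal them out round-robin, so that machine i takes positions i, i+N, i+2N and so on. The class name `Shards` and the method `partition` are hypothetical. Every machine that sees the same file list computes the same split.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class Shards {

    /** Splits the sorted file list into n disjoint shards, round-robin. */
    static List<List<String>> partition(List<String> files, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        // Sort a copy so the split does not depend on listing order.
        List<String> sorted = new ArrayList<>(files);
        Collections.sort(sorted);
        List<List<String>> shards = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            shards.add(new ArrayList<>());
        }
        for (int i = 0; i < sorted.size(); i++) {
            shards.get(i % n).add(sorted.get(i));
        }
        return shards;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("d.xml", "a.xml", "c.xml", "b.xml", "e.xml");
        // prints [[a.xml, c.xml, e.xml], [b.xml, d.xml]]
        System.out.println(partition(files, 2));
    }
}
```
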
> Of course, besides using `collection()` extensively, I haven't ever tried
> the other stuff I proposed above -- would be really interesting to try.
>
> Cheers,
> Dimitre
>
> On Sun, Dec 15, 2019 at 1:02 AM Trevor Nicholls trevor@xxxxxxxxxxxxxxxxxx
> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
>> Hi
>>
>>
>>
>> An application I am working on contains a large number of source
>> documents which are all run through the same series of transformations.
>> While initially the build process didn't take long, the cost of repeatedly
>> initialising the XSL processor soon adds up, so I am looking at ways to
>> streamline it.
>>
>>
>>
>> Our processor of choice is Saxon (currently we are using 8.7.3) so I can
>> shift this question to the Saxon list if there are extensions there that
>> are relevant.
>>
>>
>>
>> So the question: given a script that essentially includes the following:
>>
>>
>>
>> cd documents
>>
>> for d in `cat dlist`; do
>>
>>   cd $d
>>
>>   for f in `cat flist`; do
>>
>>     java -jar $SAXONDIR/saxon8.jar  -o  $f.new.xml  $f.xml  $SCRIPTDIR/transform.xsl  doc=$d  file=$f
>>
>>   done
>>
>> done
>>
>>
>>
>> is there a mechanism which would allow a single Java process to perform
>> the equivalent?
>>
>>
>>
>> Thanks
>>
>> T
>>
>
>
> --
> Cheers,
> Dimitre Novatchev
> ---------------------------------------
> Truly great madness cannot be achieved without significant intelligence.
> ---------------------------------------
> To invent, you need a good imagination and a pile of junk
> -------------------------------------
> Never fight an inanimate object
> -------------------------------------
> To avoid situations in which you might make mistakes may be the
> biggest mistake of all
> ------------------------------------
> Quality means doing it right when no one is looking.
> -------------------------------------
> You've achieved success in your field when you don't know whether what
> you're doing is work or play
> -------------------------------------
> To achieve the impossible dream, try going to sleep.
> -------------------------------------
> Facts do not cease to exist because they are ignored.
> -------------------------------------
> Typing monkeys will write all Shakespeare's works in 200yrs.Will they
> write all patents, too? :)
> -------------------------------------
> Sanity is madness put to good use.
> -------------------------------------
> I finally figured out the only reason to be alive is to enjoy it.
>
>


