Subject:big input file for XSLT Author:Fairy Lee Date:06 Feb 2009 03:39 PM
I need to run an XSLT transformation on a large input file, about 60M. Stylus Studio seems to run forever. I think it is a memory problem. However, after I added -Xmx512M to the Java VM parameters from the Options menu, it reported an error, something like "can't find out INI something", and then Stylus Studio misbehaved so badly that I could not open it properly. I also tried to run the XSLT from the command line, but had no luck. Any advice? I would really appreciate it.
Subject:big input file for XSLT Author:Fairy Lee Date:06 Feb 2009 03:47 PM
Additional information:
My input file size is 69M.
The maximum Allocated Stack Size for XSLT that I can set in Stylus Studio is 64M.
Is that the problem?
How can I raise the maximum setting for Allocated Stack Size?
Subject:big input file for XSLT Author:Minollo I. Date:06 Feb 2009 05:34 PM
You are likely just running into memory troubles; I assume you are using the (default) Saxon XSLT processor, right? (XSLT > Scenario settings).
You can increase your heap, but you may want to be careful because if you set it to a value that cannot be achieved on your machine, the JVM will fail to initialize. The stack size is not a problem if you are running Saxon (which is what we would suggest doing anyway).
You may want to consider using XQuery; the embedded DataDirect XQuery is a highly scalable engine.
Subject:big input file for XSLT Author:Fairy Lee Date:08 Feb 2009 10:59 PM Originally Posted: 08 Feb 2009 10:43 PM
Thank you for your reply.
Yes, you are correct: I am running the XSLT with the Saxon XSLT processor. I added -Xmx256m to the JVM parameters, but it still doesn't work. I could not use a larger value, such as -Xmx512m, because, just as you said, the JVM failed to initialize.
Question 1: if I run the XSLT from the command line, for example with StylusXslt, how can I set the heap size for it?
Question 2: is there any other way to give the XSLT processor more memory?
Question 3: I am not familiar with the XQuery sibling syntax. Could you please give me some suggestions for my sample if I use XQuery to replace the XSLT code?
My input XML is structured like this:
<ROOT>
  <STATEMENT>
    .......
  </STATEMENT>
  <ACCOUNT>
    ...
  </ACCOUNT>
  <TRANSACTION>
    ....
  </TRANSACTION>
  <TRANSACTION>
    ....
  </TRANSACTION>
  <ACCOUNT>
    ....
  </ACCOUNT>
  <TRANSACTION>
    ....
  </TRANSACTION>
</ROOT>
I want the output XML to look like this:
<ROOT>
  <STATEMENT>
    .......
    <ACCOUNT>
      ...
      <TRANSACTION>
        ....
      </TRANSACTION>
      <TRANSACTION>
        ....
      </TRANSACTION>
    </ACCOUNT>
    <ACCOUNT>
      ...
      <TRANSACTION>
        ....
      </TRANSACTION>
    </ACCOUNT>
  </STATEMENT>
</ROOT>
How can I use the XQuery child and sibling syntax to build this structure?
Subject:big input file for XSLT Author:Minollo I. Date:08 Feb 2009 11:06 PM
In the XSLT Scenario properties you can choose which processor to use to "process" your stylesheet. Saxon is the default processor used by Stylus Studio; "built-in" is a simpler XSLT processor built into Stylus Studio and good mostly for debugging (step-by-step execution). You are apparently using Saxon, so there is no need to worry there.
This XQuery may be a start:
(: ACCOUNT siblings that follow $item and come before the next sibling
   that is neither an ACCOUNT nor a TRANSACTION :)
declare function local:getRelatedAccounts($item) {
  let $nextItem := $item/following-sibling::*[local-name()!="ACCOUNT" and local-name()!="TRANSACTION"][1]
  for $related in $item/following-sibling::*[local-name()="ACCOUNT"]
  where if ($nextItem) then $related << $nextItem else true()
  return $related
};

(: TRANSACTION siblings that follow $item and come before the next
   sibling that is not a TRANSACTION :)
declare function local:getRelatedTransactions($item) {
  let $nextItem := $item/following-sibling::*[local-name()!="TRANSACTION"][1]
  for $related in $item/following-sibling::*[local-name()="TRANSACTION"]
  where if ($nextItem) then $related << $nextItem else true()
  return $related
};

<ROOT> {
  for $statement in /ROOT/STATEMENT
  return
    <STATEMENT> {
      for $account in local:getRelatedAccounts($statement)
      return
        <ACCOUNT name="{$account/string()}"> {
          for $transaction in local:getRelatedTransactions($account)
          return
            <TRANSACTION> {
              $transaction/string()
            } </TRANSACTION>
        } </ACCOUNT>
    } </STATEMENT>
} </ROOT>
Subject:big input file for XSLT Author:Fairy Lee Date:09 Feb 2009 12:27 PM Originally Posted: 09 Feb 2009 12:14 PM
Thank you for your great help. Two more questions:
Question 1: could you please explain "where if ($nextItem)then $related<<$nextItem else true()" in more detail? I am confused about this condition.
Question 2: suppose there is a statement trailer record that should be placed inside STATEMENT, but after the ACCOUNT elements. How should I code that?
I added the following code, but the result is not correct.
declare function local:getRelatedStatTrailer($item) {
  let $nextItem := $item/following-sibling::*[local-name()!="STATEMENTTRAILER"][1]
  for $related in $item/following-sibling::*[local-name()="STATEMENTTRAILER"]
  where if ($nextItem) then $related << $nextItem else true()
  return $related
};
.....
<STATEMENTS>
{
  <STATEMENT>
  {
    <ACCOUNT>
    </ACCOUNT>
    {
      for $stattrailer in local:getRelatedStatTrailer($statement)
      return
        <TOTAL_RECORDS>{$stattrailer/TOTAL-RECORDS/text()}</TOTAL_RECORDS>
    }
  }</STATEMENT>
}</STATEMENTS>
If there are two statements, the first statement returns both the first trailer and the second trailer; the second statement is OK.
Also, $stattrailer contains more than one element, but unless I wrap them in <Trailer></Trailer>, I can only put one element in the code. If I add more, it reports an error. For example, I also want to add <TOTAL_ACCOUNT>{$stattrailer/TOTAL-ACCOUNT/text()}</TOTAL_ACCOUNT> directly after <TOTAL_RECORDS>. How can I do that?
Subject:big input file for XSLT Author:Minollo I. Date:09 Feb 2009 12:43 PM
>Question 1: could you please give more explanation about "where if
>($nextItem)then $related<<$nextItem else true()". I am confused about
>this condition.
It's filtering the "related" items to fetch only the ones that come *before* the next container element, or all of them if there are no more container elements.
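As a minimal sketch of that filter in isolation: `<<` is the XQuery node-order comparison, which is true when the node on the left comes before the node on the right in document order. The sample data below is made up for illustration and is not the poster's file.

```xquery
(: Hypothetical sample data, invented for this sketch :)
let $root :=
  <ROOT>
    <ACCOUNT>A1</ACCOUNT>
    <TRANSACTION>T1</TRANSACTION>
    <TRANSACTION>T2</TRANSACTION>
    <ACCOUNT>A2</ACCOUNT>
    <TRANSACTION>T3</TRANSACTION>
  </ROOT>
let $item := $root/ACCOUNT[1]
(: the first following sibling that is not a TRANSACTION closes the group :)
let $nextItem := $item/following-sibling::*[local-name() != "TRANSACTION"][1]
for $related in $item/following-sibling::*[local-name() = "TRANSACTION"]
(: keep only transactions located before that boundary;
   if there is no boundary, keep all of them :)
where if ($nextItem) then $related << $nextItem else true()
return $related/string()
```

For the first ACCOUNT the boundary is the second ACCOUNT, so only T1 and T2 pass the filter. Starting from `$root/ACCOUNT[2]` instead, `$nextItem` is empty, the `else true()` branch applies, and T3 is returned.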
>Question 2: if there is a statement trailer record that should be put
>in STATEMENT, but after ACCOUNT. How to code it?
I think you want to change your function into:
declare function local:getRelatedStatTrailer($item) {
  let $nextItem := $item/following-sibling::*[local-name()="STATEMENT"][1]
  for $related in $item/following-sibling::*[local-name()="STATEMENTTRAILER"]
  where if ($nextItem) then $related << $nextItem else true()
  return $related
};
$nextItem needs to somehow identify the start of the next group; as you don't have sub-groups, in your case it's probably easiest to identify the next group by just finding the next "STATEMENT".
The code should look something similar to this:
<ROOT> {
  for $statement in $input/STATEMENT
  return
    <STATEMENT> {
      for $account in local:getRelatedAccounts($statement)
      return
        <ACCOUNT name="{$account/string()}"> {
          for $transaction in local:getRelatedTransactions($account)
          return
            <TRANSACTION> {
              $transaction/string()
            } </TRANSACTION>
        } </ACCOUNT>,
      for $stattrailer in local:getRelatedStatTrailer($statement)
      return
        <TOTAL_RECORDS>{$stattrailer/TOTAL-RECORDS/text()}</TOTAL_RECORDS>
    } </STATEMENT>
} </ROOT>
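The comma after </ACCOUNT> in the query above is the XQuery sequence operator: it lets one enclosed expression return several items side by side, with no wrapper element. When a single return clause has to emit more than one element, the same operator can be used inside parentheses. A minimal sketch, assuming a made-up trailer record (the TOTAL-ACCOUNT element comes from the question; the values are invented):

```xquery
(: Hypothetical trailer record, invented for this sketch :)
let $stattrailer :=
  <STATEMENTTRAILER>
    <TOTAL-RECORDS>12</TOTAL-RECORDS>
    <TOTAL-ACCOUNT>3</TOTAL-ACCOUNT>
  </STATEMENTTRAILER>
return
  <STATEMENT> {
    for $t in $stattrailer
    (: parentheses keep both elements inside the same return clause :)
    return (
      <TOTAL_RECORDS>{$t/TOTAL-RECORDS/text()}</TOTAL_RECORDS>,
      <TOTAL_ACCOUNT>{$t/TOTAL-ACCOUNT/text()}</TOTAL_ACCOUNT>
    )
  } </STATEMENT>
```

Without the parentheses, the comma would end the for expression, and $t would no longer be in scope for the second element.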
Subject:big input file for XSLT Author:Fairy Lee Date:09 Feb 2009 01:45 PM
I followed your earlier example, but it doesn't work. I have attached the XML input file and the XQuery file. Could you please take the time to look at them?
Also, I want to add
<total_30>{$record11/total_30/text()}</total_30>
directly after <total_20></total_20>.
I use
{
  for $record11 in local:getRelated11($record_10)
  return
    <total_20>{$record11/total_20/text()}</total_20>
    <total_30>{$record11/total_30/text()}</total_30>
}
but it reports an error.
It seems I have to use:
<trailer>
  <total_20>{$record11/total_20/text()}</total_20>
  <total_30>{$record11/total_30/text()}</total_30>
</trailer>
but I don't want <trailer></trailer> in the output file. Any advice?
Subject:big input file for XSLT Author:Minollo I. Date:09 Feb 2009 02:04 PM
Instead of me trying to guess what you are trying to do, and you trying to map my suggestions onto your real use case, why don't you post a cleaned-up sample of your incoming XML and the XML you would like to get as a result? That would speed things up.
Subject:big input file for XSLT Author:Fairy Lee Date:09 Feb 2009 02:31 PM
My case is really like the one in your post http://www.xml-connection.com/2008/02/convertig-proprietary-file-formats-to.html except for the "trailer" part. I updated your sample files and attached them to my previous message. After the update, they match my case exactly. Could you please take a look at the attached files? This is the easiest way for me because the raw data file is very complex and it is hard for me to create a sample data file for you. Thanks a lot.
By the way, could you please tell me where I can find the syntax reference for XQuery constructs such as following-sibling and <<?
Subject:big input file for XSLT Author:Minollo I. Date:09 Feb 2009 04:15 PM
A very small mistake; change != into = when getting the nextItem:
declare function local:getRelated11($item) {
  let $nextItem := $item/following-sibling::*[local-name()="record_10"][1]
  for $related in $item/following-sibling::*[local-name()="record_11"]
  where if($nextItem) then $related << $nextItem else true()
  return $related
};
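To see the corrected function at work, here is a minimal sketch on made-up data. The record_20 element, the root and group element names, and all values are hypothetical; the attached file may differ. With `=`, $nextItem is the next record_10, which is the start of the next group. With `!=`, it would instead be the first following sibling of any other type, which is not necessarily a group boundary.

```xquery
declare function local:getRelated11($item) {
  (: the next record_10 marks the start of the next group :)
  let $nextItem := $item/following-sibling::*[local-name()="record_10"][1]
  for $related in $item/following-sibling::*[local-name()="record_11"]
  where if ($nextItem) then $related << $nextItem else true()
  return $related
};

(: Hypothetical sample data, invented for this sketch :)
let $root :=
  <root>
    <record_10>first</record_10>
    <record_20>detail</record_20>
    <record_11><total_20>1</total_20></record_11>
    <record_10>second</record_10>
    <record_20>detail</record_20>
    <record_11><total_20>2</total_20></record_11>
  </root>
for $record_10 in $root/record_10
return
  <group name="{$record_10/string()}"> {
    local:getRelated11($record_10)/total_20
  } </group>
```

On this sample each group picks up only its own trailer: the group named "first" contains the total_20 with value 1, and the group named "second" contains the one with value 2.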
Subject:big input file for XSLT Author:Fairy Lee Date:09 Feb 2009 08:18 PM
I really appreciate your great help!
Could you please take a look at my other question?
I use
{
  for $record11 in local:getRelated11($record_10)
  return
    <total_20>{$record11/total_20/text()}</total_20>
    <total_30>{$record11/total_30/text()}</total_30>
}
but it reports an error.
It seems I have to use:
<trailer>
  <total_20>{$record11/total_20/text()}</total_20>
  <total_30>{$record11/total_30/text()}</total_30>
</trailer>
but I don't want to have <trailer></trailer> in the output file. Any advice?