Re: Streaming with XSLT version 3.0
Hi all,
First of all, I want to thank you all for your opinions and feedback on this thread.

Regarding the OutOfMemory problem: this can happen if the transformation result is very large and the user chooses to see it in the Results view. The Results view uses a simple text area, which does not support loading such large content. This option can be disabled by editing the associated transformation scenario, opening the 'Output' tab and unselecting all the checkboxes in the 'Show in results view as' section. More details about how to configure the transformation scenario output can be found here:
http://oxygenxml.com/doc/ug-editor/#topics/the-output-tab.html#the-output-tab

I will add a feature request in our issue tracking system to improve the handling of this situation.

After I disabled displaying the transformation output in the Results view, I tried to transform a 3 GB file that has a structure similar to the one posted by Terry. In this case I found another problem: the transformation runs 6 times slower in oXygen than on the command line.

This happens because the Saxon-EE schema-based validation (-val:lax) feature is active by default in oXygen when running a transformation with the Saxon-EE processor. The main feature in the first Saxon-EE versions was the schema-aware validation (-sa switch). So we assumed that a user who chooses to run with Saxon-EE does so because he wants schema-aware validation. Meanwhile, the list of features available in Saxon-EE has grown, and now there are many more reasons to use Saxon-EE. I will add an issue in our bug tracking system to reconsider the default for this option (-val).

To disable the 'schema-aware validation' option you have to edit the associated scenario and press the 'Advanced Options' button located next to the Saxon-EE processor combo box. The 'Advanced Options' button displays a dialog that allows you to customize the Saxon-EE processor.
In this dialog you have to choose 'Disable schema validation' for the 'Validation on source file (-val)' option.

In conclusion, by not showing the output in the Results view and by disabling the schema-aware validation, you will get the same execution time when running the transformation from oXygen and from the command line.

Regards,
Radu

--
Radu Pisoi
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com

On 3/8/2014 23:14, Terry Badger wrote:

Michael,
I did run the process successfully. See my notes here. I have reported it to Oxygen.

Details for running a large file with xslt v3 streaming
==========
Large source file is found here:
http://dumps.wikimedia.org/enwiki/20130403/enwiki-20130403-pages-articles-multistream.xml.bz2
==========
Here is the result of Saxon running from a DOS shell, with a respectable 21 minutes and no out-of-memory report

C:\Temp\wiki>C:\Progra~2\Java\jre7\bin\java -Xmx180m -Xss4096k -Xms48m -cp C:/saxon/saxon9ee.jar; net.sf.saxon.Transform -TJ -t -it:main -o:C:/Temp/wiki/out/wiki-03-output.xml C:/Temp/wiki/xsl/wiki-03.xsl
Saxon-EE 9.5.1.4J from Saxonica
Java version 1.7.0_45
Using license serial number V001638
Generating byte code...
Stylesheet compilation time: 476 milliseconds
Processing (no source document) initial template = main
URIResolver.resolve href="../source/enwiki.xml" base="file:/C:/Temp/wiki/xsl/wiki-03.xsl"
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Writing to file:/C:/Temp/wiki/out/output-wiki-03.xml
Execution time: 21m 24.612s (1284612ms)
Memory used: 25491272
NamePool contents: 28 entries in 27 chains.
7 URIs
==========
With this xsl stylesheet

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns="http://www.mediawiki.org/xml/export-0.8/"
    xpath-default-namespace="http://www.mediawiki.org/xml/export-0.8/"
    exclude-result-prefixes="#all"
    version="3.0">

  <xsl:output method="xml"/>
  <xsl:variable name="root" select="/"/>
  <xsl:mode streamable="yes"/>

  <xsl:template name="main">
    <!-- Read the source as a stream rather than building a tree in memory -->
    <xsl:stream href="../source/enwiki.xml">
      <xsl:result-document href="../out/output-wiki-03.xml">
        <count>
          <!-- Visit the page elements one at a time, carrying a running count -->
          <xsl:iterate select="mediawiki/page">
            <xsl:param name="count" select="0" as="xs:decimal"/>
            <xsl:next-iteration>
              <xsl:with-param name="count" select="$count+1"/>
            </xsl:next-iteration>
            <!-- Write the final count once every page has been seen -->
            <xsl:on-completion>
              <xsl:value-of select="$count"/>
            </xsl:on-completion>
          </xsl:iterate>
        </count>
      </xsl:result-document>
    </xsl:stream>
  </xsl:template>

</xsl:stylesheet>
============
With this result file

<?xml version="1.0" encoding="UTF-8"?>
<count xmlns="http://www.mediawiki.org/xml/export-0.8/">13355093</count>
============
While running in Oxygen 15.2 with Saxon 9.5.1.3, with the same source and stylesheet files, after about an hour we had an out of memory error. I have reported it to Oxygen.
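
For comparison, the same count can probably be written without an explicit loop. This is only a sketch, assuming the processor treats count() over a downward path as a streamable expression inside xsl:stream, as the XSLT 3.0 drafts describe for aggregate functions. It has not been run against the Wikipedia dump, and the file paths are simply the ones from the stylesheet quoted above.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.mediawiki.org/xml/export-0.8/"
    xpath-default-namespace="http://www.mediawiki.org/xml/export-0.8/"
    exclude-result-prefixes="#all"
    version="3.0">

  <xsl:output method="xml"/>

  <!-- Entry point, meant to be invoked with -it:main like the original stylesheet -->
  <xsl:template name="main">
    <!-- Stream the source; paths are copied from the quoted stylesheet -->
    <xsl:stream href="../source/enwiki.xml">
      <xsl:result-document href="../out/output-wiki-03.xml">
        <count>
          <!-- Assumption: count() over this path is evaluated in a single streamed pass -->
          <xsl:value-of select="count(mediawiki/page)"/>
        </count>
      </xsl:result-document>
    </xsl:stream>
  </xsl:template>

</xsl:stylesheet>
```

The xsl:iterate form remains the more general pattern when several values have to be accumulated in the same pass over the input.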