Re: memory usage of xslt processing
On Wed, 19 Apr 2006 13:59:08 +0100, "Michael Kay" <mike@xxxxxxxxxxxx> wrote:

> XSLT processors generally read the whole document into memory. Some
> products may be able to avoid this under certain circumstances, for
> example see
> http://www.saxonica.com/documentation/sourcedocs/serial.html for
> Saxon.

I have to use Xalan, and I have heard of its "SQL extensions". I will try them out.

> Running one transformation per row is certainly feasible in principle
> though there may be a significant start-up overhead - you'll only
> find out by measurement.

Yes, but http://randspringer.de/sax_row.tar currently gives me an error. It is also "ugly" because I have to provide the header myself.

> Alternatively, why not retrieve the data from the database in
> transformer-sized chunks?

That does not remove the problem with the header. Of course, it should be faster to call stylesheet processing for multiple rows instead of for a single row.

As a next step I will have a look at http://stx.sourceforge.net/ and http://joost.sourceforge.net/.

Thank you,
Thomas

> Michael Kay
> http://www.saxonica.com/
>
> > -----Original Message-----
> > From: Thomas Porschberg [mailto:thomas.porschberg@xxxxxxxxx]
> > Sent: 19 April 2006 13:36
> > To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > Subject: memory usage of xslt processing
> >
> > Hi,
> >
> > I have the following task:
> > Create an arbitrarily formatted file (XML/HTML/CSV, whatever)
> > based on a Select from a database.
> >
> > As a constraint, the data fetched from the database cannot be
> > stored in memory as a whole.
> > Another constraint is that I cannot use XML functionality in
> > the database; I have to implement the functionality on top of
> > our database access framework. This database access framework
> > fetches records one after another.
> > And I have to use Java and Xalan.
> >
> > My idea was to decorate every fetched row from the database
> > with simple generic XML and fire this to Xalan.
> >
> > Let's do an example:
> > If my result set from the database looks like:
> >
> > ID Name  Description
> > -- ----  -----------
> > 1  "dog" "an animal may be dangerous"
> > 2  "cat" "an animal likes milk"
> >
> > I create the following XML:
> >
> > <?xml version="1.0" encoding="UTF-8"?>
> > <dataset>
> >   <row>
> >     <value>1</value>
> >     <value>dog</value>
> >     <value>an animal may be dangerous</value>
> >   </row>
> >   <row>
> >     <value>2</value>
> >     <value>cat</value>
> >     <value>an animal likes milk</value>
> >   </row>
> > </dataset>
> >
> > I create this XML as SAX events fired from a Java class
> > (StringArrayXMLReader), which implements the
> > org.xml.sax.XMLReader interface.
> > I have three methods:
> >
> > public void init() throws SAXException {
> >     ch.startDocument();
> >     ch.startElement("", "dataset", "dataset", EMPTY_ATTR);
> > }
> >
> > public void close() throws SAXException {
> >     ch.endElement("", "dataset", "dataset");
> >     ch.endDocument();
> > }
> >
> > public void parse(String[] input) throws SAXException {
> >     ch.startElement("", "row", "row", EMPTY_ATTR);
> >     for (int i = 0; i < input.length; ++i) {
> >         ch.startElement("", "value", "value", EMPTY_ATTR);
> >         ch.characters(input[i].toCharArray(), 0, input[i].length());
> >         ch.endElement("", "value", "value");
> >     }
> >     ch.endElement("", "row", "row");
> > }
> >
> > The parse method creates the <row>...</row> entries for the
> > String array passed to it.
> > The StringArrayXMLReader is associated with a
> > TransformerHandler, which uses an XSL stylesheet to transform
> > the XML to the desired output.
> >
> > What happens here is that when the fetch from the database
> > starts I call init() (and thus startDocument()), and at
> > last, after the fetch has finished, I call close() (and thus
> > endDocument()).
> > I observed that the xslt processing starts when endDocument()
> > is called.
> > This is not acceptable for me because I fear the xslt
> > processor reads all the rows into memory until endDocument()
> > is called, and in that case I risk running into an
> > OutOfMemoryError.
> >
> > My second idea was to eliminate the init()/close() methods
> > and to consider one <row>...</row> section as the complete
> > document input for the processor. This has the disadvantage
> > that I have to create the head and tail of the document
> > manually (and in my example I get a NullPointerException when
> > the transformer is called twice).
> >
> > I have the following questions:
> > Is it possible to create the output without having the whole
> > data in memory?
> > The basic XML for xslt processing
> > <dataset>
> > <row><value>...
> > <row><value>...
> > </dataset>
> > looks very simple and the supplied XSL stylesheets will
> > not be complex, so my hope is to get it working.
> > I also think that the task in general - produce formatted
> > output from a potentially very large data pool - should be a
> > common one. Unfortunately I did not do much xslt processing in
> > the past, so I lack the experience (only a bit with libxslt,
> > which I fed a DOM tree).
> > If someone has some useful links I would be very glad to hear
> > about them.
> > I provide my test code at:
> >
> > http://randspringer.de/sax_row.tar and
> > http://randspringer.de/sax.tar
> >
> > If someone could have a look at it I would really appreciate it.
> >
> > Thomas
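
The per-row approach discussed above (one <row> document per transformation, with the header written by hand) could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code from sax_row.tar: the class name RowAtATime, the inline CSV stylesheet and the header line are all hypothetical and chosen only for illustration. It compiles the stylesheet once into a Templates object and creates a fresh TransformerHandler for each row rather than reusing one.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.helpers.AttributesImpl;

public class RowAtATime {
    private static final AttributesImpl EMPTY_ATTR = new AttributesImpl();

    // Hypothetical stylesheet: turns a single <row> document into one CSV line.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/row'>"
      + "<xsl:for-each select='value'>"
      + "<xsl:if test='position() &gt; 1'>,</xsl:if><xsl:value-of select='.'/>"
      + "</xsl:for-each><xsl:text>&#10;</xsl:text>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String transform(List<String[]> rows) throws Exception {
        // Assumes the installed factory supports the SAX extensions,
        // as Xalan and the JDK's built-in processor do.
        SAXTransformerFactory factory =
            (SAXTransformerFactory) TransformerFactory.newInstance();
        // Compile the stylesheet once; only the handler is created per row.
        Templates templates =
            factory.newTemplates(new StreamSource(new StringReader(XSL)));

        StringWriter out = new StringWriter();
        out.write("ID,Name,Description\n"); // header provided by hand (assumed layout)

        for (String[] row : rows) {
            TransformerHandler handler = factory.newTransformerHandler(templates);
            handler.setResult(new StreamResult(out));
            // Each row is a complete document, so the processor never
            // holds more than one row at a time.
            handler.startDocument();
            handler.startElement("", "row", "row", EMPTY_ATTR);
            for (String value : row) {
                handler.startElement("", "value", "value", EMPTY_ATTR);
                handler.characters(value.toCharArray(), 0, value.length());
                handler.endElement("", "value", "value");
            }
            handler.endElement("", "row", "row");
            handler.endDocument();
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        List<String[]> rows = Arrays.asList(
            new String[] {"1", "dog", "an animal may be dangerous"},
            new String[] {"2", "cat", "an animal likes milk"});
        System.out.print(transform(rows));
    }
}
```

In a real program the StringWriter would be replaced by a file or stream writer and the list by the row-by-row database fetch. Whether the per-row handler creation is cheap enough can only be settled by measurement, as noted in the reply above.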