RE: Incremental transformations with Xalan and performance issues?

You might find it better to ask such questions on the xsl-list at
mulberrytech.com, or if you're really interested only in Xalan, on a
Xalan-specific forum.

In general, every mainstream XSLT processor today builds a tree
representation of the input document in memory. I believe Xalan does parsing
and transformation in parallel, but it still builds the tree. The fact that
the parser and the transformer communicate using SAX does not change this: it
just means that the transformer, not the parser, is building the tree. (The
division of labour does matter in one respect: the transformer can build a
much more efficient tree knowing it is read-only. But it's still an in-memory
tree.)
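
To make the point above concrete, here is a minimal sketch in Python (not
Xalan's or Saxon's actual code) of a consumer that receives SAX events and
still ends up holding the whole document. The class name, function name and
sample data are invented for illustration; only the standard-library
xml.sax and xml.etree APIs are real.

```python
import xml.sax
from xml.etree.ElementTree import TreeBuilder


class TreeBuildingHandler(xml.sax.ContentHandler):
    """Hypothetical handler: consumes SAX events, builds a full tree."""

    def __init__(self):
        super().__init__()
        self._builder = TreeBuilder()
        self.root = None

    def startElement(self, name, attrs):
        self._builder.start(name, dict(attrs.items()))

    def characters(self, content):
        self._builder.data(content)

    def endElement(self, name):
        self._builder.end(name)

    def endDocument(self):
        # Only now is the tree complete; every node is still in memory.
        self.root = self._builder.close()


def build_tree(xml_bytes):
    handler = TreeBuildingHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.root


# Invented sample data shaped like the mail merge in the question.
SAMPLE = (
    b"<mailmerge><letter>Dear customer</letter><recipients>"
    b"<recipient>Ann</recipient><recipient>Bob</recipient>"
    b"</recipients></mailmerge>"
)

root = build_tree(SAMPLE)
retained = root.findall("recipients/recipient")
```

After parsing, `retained` holds every recipient element, so memory use grows
with the size of the document even though the input arrived as a stream of
events.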

I can't speak for Xalan, but Saxon users are running transformations on
documents of up to 200MB or so without too much trouble, and at speeds up to
10MB/sec. It requires a little care in configuring the memory allocation, and
in writing the stylesheet to avoid non-linear constructs (those whose cost
grows faster than the size of the document), but it's certainly doable.
Beyond that, it probably gets difficult. You don't actually say what you
mean by a "large document". (Personally, I am amazed to see people handling
a 200MB database as a single in-memory document, but perhaps I'm just
old-fashioned.)

If you really need purely serial processing, you might consider STX as an
alternative. However, the existing STX implementations are far less
widely-used or mature than the popular XSLT implementations.
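
As a rough illustration of what purely serial processing means for the mail
merge in the question, here is a minimal sketch using a hand-written SAX
handler in Python. It is not STX and not an XSLT processor. It assumes the
letter appears before the recipients, as in the question, and the "{name}"
placeholder, the class and function names and the sample data are all
invented for the example.

```python
import xml.sax


class MailMergeHandler(xml.sax.ContentHandler):
    """Hypothetical handler: keeps the letter, streams the recipients."""

    def __init__(self, emit):
        super().__init__()
        self._emit = emit      # callback invoked once per recipient
        self._letter = None    # preamble, retained for the whole run
        self._text = None      # text buffer; None when not capturing

    def startElement(self, name, attrs):
        if name in ("letter", "recipient"):
            self._text = []

    def characters(self, content):
        if self._text is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "letter":
            self._letter = "".join(self._text).strip()
            self._text = None
        elif name == "recipient":
            recipient = "".join(self._text).strip()
            self._text = None  # discard the recipient once it is handled
            # Assumes the letter has already been seen (document order).
            self._emit(self._letter.replace("{name}", recipient))


def merge(xml_bytes):
    out = []
    xml.sax.parseString(xml_bytes, MailMergeHandler(out.append))
    return out


# Invented sample data; "{name}" is a made-up placeholder convention.
SAMPLE = (
    b"<mailmerge><letter>Dear {name},</letter><recipients>"
    b"<recipient>Ann</recipient><recipient>Bob</recipient>"
    b"</recipients></mailmerge>"
)

letters = merge(SAMPLE)
```

Only the letter and the recipient currently being read are held at any time,
so memory use does not depend on the number of recipients. The price is that
the handler can never look backwards or forwards in the document, which is
exactly the freedom an XSLT stylesheet relies on.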

Michael Kay
http://www.saxonica.com/


> -----Original Message-----
> From: Andrzej Jan Taramina [mailto:andrzej@c...] 
> Sent: 03 December 2004 23:45
> To: xml-dev@l...
> Subject:  Incremental transformations with Xalan and performance issues?
> 
> I'm in a situation where I need to parse some large documents, where the
> first few elements are a preamble with various parameters and the end of
> the document is a large list of entries.
> 
> Think of a mail merge, where the letter to be sent is defined first in the
> mail merge xml, followed by numerous recipient entries, something like
> this:
> 
> <mailmerge>
> 	<letter>
> 		...letter def goes here
> 	</letter>
> 	<recipients>
> 		<recipient>
> 			...recipient data
> 		</recipient>
> 		<recipient>
> 			...recipient data
> 		</recipient>
> 		etc...
> 	</recipients>
> </mailmerge>
> 
> What I was wondering was how Xalan handles the processing of such large
> documents (say a million recipient entries) when the parser is using SAX?
> 
> More specifically, if I create global variables such as:
> 
> 	<xsl:variable name="letterTemplate" select="/mailmerge/letter"/>
> 
> then later:
> 
> 	<xsl:template match="recipients/recipient">
> 		<!-- process the recipient using $letterTemplate -->
> 	</xsl:template>
> 
> Will the processing be incremental in nature, as SAX events are received
> by Xalan?  That is, is Xalan smart enough to create the global as soon as
> it can, followed by processing of each individual recipient as each
> related SAX event is received?  In that case, having the shared global
> info early in the document and the large list at the end would probably
> have beneficial performance implications.
> 
> Or will the whole document have to be instantiated as some sort of
> internal tree first?
> 
> Hopefully, it's incremental in nature, since otherwise we might blow out
> memory with such large documents.
> 
> Any insight into the implications of processing such large documents,
> using globals, xslt stylesheet structure, impact of element ordering in
> the document and the like would be very much appreciated.
> 
> Thanks!
> 
> 
> 
> 
> Andrzej Jan Taramina
> Chaeron Corporation: Enterprise System Solutions
> http://www.chaeron.com
> 
> 