Re: optimization for very large, flat documents
Thanks to everyone who responded. For now I plan to follow Pieter's idea of chunking the data into manageable pieces (16-64 MB). Then I'm going to look into Michael's suggestions about STX (unfortunately not yet a W3C Recommendation, and thus not widely implemented) and XQuery.

For anyone interested in some numbers: I've split each of my 2 large files (613 MB and 656 MB) into subfiles of 16 K independent entries each (the entries vary in size), yielding sets of 25 and 37 subfiles (approx. 25 MB and 17 MB each, respectively). I process them by running Saxon 8.2 from the command line (with an -Xmx value of 8 times the file size) on a Sun UltraSPARC with 2 GB of real memory. The set of 37 17 MB subfiles is processed with a slightly simpler stylesheet and takes about 1:15 (minutes:seconds) per file; the set of 25 25 MB subfiles uses 1 document() call per entry to/from a servlet on a different host and takes about 8 minutes per file.

My next step is to use Saxon's profiling features to find out where I can improve my stylesheet's performance. Thanks again to everyone on xsl-list for all your help!

-- Kevin Rodgers
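The splitting step itself isn't shown in the post; below is a minimal sketch of the chunking idea in Python, assuming a flat document whose top-level children are independent entries (the tag names `entries` and `entry` and the function `split_entries` are illustrative, not from the original). It streams the input with `iterparse` so the full document is never held in memory, which is the point of chunking in the first place.

```python
import xml.etree.ElementTree as ET
from io import BytesIO


def split_entries(xml_bytes, entries_per_chunk, entry_tag="entry", root_tag="entries"):
    """Stream a flat XML document and group its top-level entries into
    self-contained chunk documents of at most entries_per_chunk entries."""
    chunks, current = [], []
    for _event, elem in ET.iterparse(BytesIO(xml_bytes), events=("end",)):
        if elem.tag == entry_tag:
            current.append(ET.tostring(elem, encoding="unicode"))
            elem.clear()  # release the parsed entry so memory stays flat
            if len(current) == entries_per_chunk:
                chunks.append("<%s>%s</%s>" % (root_tag, "".join(current), root_tag))
                current = []
    if current:  # leftover entries form a final, smaller chunk
        chunks.append("<%s>%s</%s>" % (root_tag, "".join(current), root_tag))
    return chunks
```

Each chunk is a well-formed document with the same root element, so the same stylesheet can be applied to every subfile unchanged; in practice one would write each chunk to its own file rather than collect them in a list.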