RE: optimization for very large, flat documents
> I'm trying to process a very large (600 MB) flat XML document, a
> bibliography where each of the 400,000 entries is completely independent
> of the others. According to the Saxon web site and mailing list, it'll
> take approx. 5-10 times that (3 GB) to hold the document tree in memory,
> which is impractical. The Saxon mailing list also has some tips about
> how to accomplish this, but my question is: Why doesn't XSLT provide a
> way to specify that a matched node can be processed independently of its
> predecessor and successor siblings? Alternatively, couldn't an XSLT
> processor infer that from the complete absence of XPath expressions that
> refer to predecessor and successor siblings?

I think the reason that XSLT vendors have not tried this approach is:

(a) there are rather few stylesheets where the technique works, and can be seen statically to work. It's not enough that all path expressions should select downwards: there must be no absolute path expressions, no global variables that select from the initial context node, no keys, and probably quite a few other conditions besides.

(b) for such stylesheets, a completely different run-time approach is needed: effectively, a different XSLT processor.

I think that in practice if you want to do serial transformation then a functional language is not the right answer: if you can only look at each piece of input data once, then you need the ability to remember what you have seen, so you need a procedural language with updatable memory. That's why STX was invented.

However, I think there is scope for someone to package up the idea of running an XSLT transform on each "record" in a large file, and then recombining the results.

Michael Kay
http://www.saxonica.com/
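
The idea at the end of the reply, transforming each "record" on its own and recombining the results, might be sketched roughly as below. This is only an illustration under stated assumptions, not anything taken from Saxon or STX: it uses Python's standard-library `iterparse` to stream the file, assumes every record is a direct child of the document element, and uses a hypothetical `transform_record` function as a stand-in for the per-record XSLT transform. The names `transform_flat_file`, `record_tag` and `wrapper_tag` are invented for the example.

```python
import io
import xml.etree.ElementTree as ET


def transform_record(record):
    """Hypothetical stand-in for a per-record XSLT transform.

    It only sees one record subtree, so it cannot refer to siblings,
    the document root, or keys over the whole document.
    """
    out = ET.Element("cite")
    out.text = record.findtext("title", default="")
    return out


def transform_flat_file(source, sink, record_tag, wrapper_tag="results"):
    """Stream `source`, transform each top-level record, write results to `sink`.

    Assumes records are direct children of the document element.
    Returns the number of records processed.
    """
    sink.write(f"<{wrapper_tag}>")
    events = ET.iterparse(source, events=("start", "end"))
    _, root = next(events)  # first event: start of the document element
    depth = 1               # we are now inside the document element
    count = 0
    for event, elem in events:
        if event == "start":
            depth += 1
            continue
        # "end" event: a record closes at depth 2 (child of the root)
        if depth == 2 and elem.tag == record_tag:
            result = transform_record(elem)
            sink.write(ET.tostring(result, encoding="unicode"))
            count += 1
            root.clear()    # drop the finished record so memory stays flat
        depth -= 1
    sink.write(f"</{wrapper_tag}>")
    return count
```

With a real file, `source` would be a path or a file opened in binary mode and `sink` a text file opened for writing. Because each record is discarded once its result has been written, memory use depends on the size of the largest record rather than on the size of the whole document, which is the point of the per-record approach. The conditions listed under (a) still apply: a transform that needs absolute paths, global variables over the input, or keys will not fit this shape.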