Subject: Re: Processing large XML Documents [> 50MB]
From: Michael Ludwig <milu71@xxxxxx>
Date: Tue, 23 Feb 2010 00:51:01 +0100
Ramkumar Menon schrieb am 22.02.2010 um 15:40:47 (-0800):
> We have a need to process XML Documents that could be over 50 megs in
> size.
>
> Due to the huge size of the document, XSLT is getting tough, with the
> environment we are running in.
>
> Basically, the data processing consists of:
>
> a) Assemble around 30-40 XML documents [each with a common header and
> its own lines] into one single XML document, with the common header
> and all the lines.
>
> b) Update the assembled document in specific locations.
>
> c) Generate multiple XML document fragments from the huge XML
> document based on query criteria. Each XML fragment is created by
> mapping specific fields in the big document, and each fragment is
> created for a specific key element value in the huge document.
>
> I am puzzled as to how to handle this one efficiently.
* Saxon streaming extension
http://www.saxonica.com/documentation/sourcedocs/serial.html
* an XML database and XQuery (Berkeley DB XML, eXist, MarkLogic, others)
* SAX filters (might easily get way too complicated, or even impossible)
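To illustrate the SAX route from the last bullet: the key idea is a streaming handler that reacts to events and never builds the full tree, so memory stays flat no matter how large the input is. Here is a minimal sketch in Python's `xml.sax` — the element and attribute names (`batch`, `line`, `key`) are invented stand-ins, since the real documents' vocabulary isn't shown:

```python
import io
import xml.sax

# Hypothetical miniature input standing in for the 50 MB document;
# "batch", "line", and "key" are invented names for illustration.
XML = (
    "<batch><header id='h1'/>"
    "<line key='A'>1</line>"
    "<line key='B'>2</line>"
    "<line key='A'>3</line></batch>"
)

class LineCounter(xml.sax.ContentHandler):
    """Streams the document; only per-key counters are held in memory."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # React to each <line> as it streams past; no tree is built.
        if name == "line":
            key = attrs.getValue("key")
            self.counts[key] = self.counts.get(key, 0) + 1

handler = LineCounter()
xml.sax.parse(io.StringIO(XML), handler)
print(handler.counts)  # {'A': 2, 'B': 1}
```

For requirement (c), the same callback could open one output file per key value and serialize each fragment as it streams by. The catch, as noted above, is that anything needing cross-document context (lookups, reordering, joining header and lines) quickly makes a hand-rolled SAX pipeline complicated, which is where the Saxon streaming extension or an XQuery database becomes attractive.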