Some clarifications -- RE: [Question] How to do incremental parsing?
Dear All,

It's amazing to get so many replies when I came to work this morning. Sorry I cannot reply to each of you individually. Here are some clarifications:

* I was wrong in saying that SAX reads the whole doc into memory. I meant to say that about DOM lazy evaluation.

* A Java DOM parser is ultimately what I am looking for. The problem with SAX is that you have to write all that tedious "startElement", "endElement" stuff again for each XML file of a different format, and the parsing never stops! Perl modules, or another scripting language like OmniMark, are not an option because they are not Java. Putting an XML doc into an RDBMS is not an option either, because it is only an awkward temporary solution. Guy Murphy mentioned the possibility of "don't use XML", but a generic XML parser is what I am looking for; otherwise it's going to be a nightmare each time a large XML file has to be dealt with. Some mentioned the row processing feature of dom4j, kXML, SAXON, minidom, easydom, and Orchard. Do they read the whole doc into memory before parsing anyway, like the DOM lazy eval? If these parsers are based on Xerces SAX, the chances are the whole doc is read into memory. An incremental SAX parser such as the suggested MSXML SAX parser seems to be the closest idea, but an incremental DOM parser would have to be built upon it. Ajay, do you have a quick reference on MSXML?

* What is "persistent DOM"?

Thanks a lot.

-- Mousheng Xu
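
To make the "incremental DOM built upon SAX" idea above concrete, here is a minimal sketch that uses only the JAXP classes shipped with the JDK. It is an illustration under stated assumptions, not a description of how dom4j, SAXON, MSXML, or any other parser mentioned above works. The class name RecordDomReader, the StopParsing exception, and the "record element name" parameter are all hypothetical. The handler is written once and reused for any format: it builds a small DOM subtree for each record element, hands it to a callback, and then drops it, so only one record is held at a time. Returning false from the callback aborts the parse, which addresses the "parsing never stops" complaint.

```java
import java.util.function.Predicate;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: a generic SAX handler that emits one DOM Element per record.
class RecordDomReader extends DefaultHandler {
    /** Thrown from the handler to abort the parse once the callback has seen enough. */
    static class StopParsing extends SAXException {
        StopParsing() { super("stopped by callback"); }
    }

    private final String recordName;            // element name that marks one record (assumed)
    private final Predicate<Element> callback;  // returns false to stop parsing
    private final Document doc;                 // used only as a factory for short-lived nodes
    private Node current;                       // null while we are outside a record

    RecordDomReader(String recordName, Predicate<Element> callback) throws Exception {
        this.recordName = recordName;
        this.callback = callback;
        this.doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        if (current == null && !qName.equals(recordName)) return; // ignore content outside records
        Element e = doc.createElement(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            e.setAttribute(atts.getQName(i), atts.getValue(i));
        }
        if (current != null) current.appendChild(e); // nested element: attach to its parent
        current = e;                                 // a new record root stays unattached
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (current != null) {
            current.appendChild(doc.createTextNode(new String(ch, start, length)));
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        if (current == null) return;
        Node parent = current.getParentNode();
        if (parent == null) {                        // the record root has just closed
            Element record = (Element) current;
            current = null;                          // drop our reference to the subtree
            if (!callback.test(record)) throw new StopParsing();
        } else {
            current = parent;
        }
    }

    /** Streams the input and calls cb once per record element, stopping early if cb returns false. */
    static void parse(InputSource in, String recordName, Predicate<Element> cb) throws Exception {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(in, new RecordDomReader(recordName, cb));
        } catch (StopParsing stop) {
            // Expected when the callback asked to stop; anything else propagates.
        }
    }
}
```

A call such as RecordDomReader.parse(new InputSource(reader), "row", r -> handle(r)) would then visit each "row" element as an ordinary org.w3c.dom.Element, where handle is whatever per-record logic the application needs. The sketch ignores namespaces and assumes that records are not needed again after the callback returns.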