Glad to see such an, err, "enthusiastic" response.

As the web page says, I'd intended to update this long ago. I've been sidetracked but will try to get back to it later this month, when I want to compare document size and processing speed for collections of documents using a common schema. I'll also try to find the fastest available SAX2 parser to use as an input-only comparison.

I'd suggest you don't waste time trying Java serialized versions of DOM - the results are horrible. You can see some at the bottom of my document models benchmarks page, at http://www.sosnoski.com/opensrc/xmlbench/results.html.

The main problem is that all the document representations (DOM, JDOM, dom4j, etc.) are tree structures of generally small objects, while Java serialization is optimized for graph structures. It uses (fairly large) handles for each object, and actually includes the handles in the encoding (as opposed to just making the values sequential and implicit). This adds a lot of bloat - Java serialized Xerces DOM ran about twice the size of the text documents in the tests I've run.

 - Dennis

Alaric Snell wrote:

>http://www.sosnoski.com/opensrc/xmls/results.html
>
> - uuugghh, I just ejaculated (sorry, ladies)!
>
>That's the kind of experiment I was planning to perform this weekend, and the
>kinds of results I imagined getting.
>
>The only difference is that I'd introduce gzipped versions of the text,
>serialised DOM tree, and XMLS data, including the time taken to deflate and
>inflate the data. Just since people keep raising gzipped text.
>
>I'll try and do that this weekend...
>
>ABS
>
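The point in the reply about Java serialization overhead on trees of small objects can be tried out with a rough sketch like the one below. It is not the benchmark code from the results page. `Elem`, `buildTree`, `textSize`, and `serializedSize` are hypothetical names made up for illustration, and `Elem` is only a toy stand-in for a document-model node, not a real DOM, JDOM, or dom4j class. The sketch builds a flat tree of small elements, then compares the byte count of a plain XML-style text rendering against the byte count produced by `java.io.ObjectOutputStream`.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class SerializedTreeSize {

    // Hypothetical stand-in for a document-model node (assumption, not a real DOM class).
    static class Elem implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final String text;
        final List<Elem> children = new ArrayList<>();

        Elem(String name, String text) {
            this.name = name;
            this.text = text;
        }
    }

    // Build a root element with n small child elements, each holding a short text value.
    static Elem buildTree(int n) {
        Elem root = new Elem("root", "");
        for (int i = 0; i < n; i++) {
            root.children.add(new Elem("item", "v" + i));
        }
        return root;
    }

    // Render the tree as markup text (no escaping; the sample values are alphanumeric).
    static void toXml(Elem e, StringBuilder sb) {
        sb.append('<').append(e.name).append('>').append(e.text);
        for (Elem c : e.children) {
            toXml(c, sb);
        }
        sb.append("</").append(e.name).append('>');
    }

    // Size in bytes of the text rendering, encoded as UTF-8.
    static int textSize(Elem root) {
        StringBuilder sb = new StringBuilder();
        toXml(root, sb);
        return sb.toString().getBytes(StandardCharsets.UTF_8).length;
    }

    // Size in bytes of the same tree written with standard Java serialization.
    static int serializedSize(Elem root) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(root);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buf.size();
    }

    public static void main(String[] args) {
        Elem root = buildTree(1000);
        System.out.println("text bytes:       " + textSize(root));
        System.out.println("serialized bytes: " + serializedSize(root));
    }
}
```

In this toy setup the serialized form should come out larger than the text form, since each small object carries a type tag, a back-reference to its class descriptor, and bookkeeping for its child list. The exact ratio depends on the object layout, so it says nothing precise about Xerces or any other real document model.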



