Re: XML tools and big documents
Don Park wrote:
> >As for the memory issue, I have thought about some sort of LZW compression of all
> >of the text in a document tree. This would save a lot of memory, but may slow
> >down building the DOM tree a bit. Any ideas on this?
>
> Your performance will suffer and memory problem still remains.
>
> Don

Well, the memory problem will remain, but it could be reduced significantly for large redundant documents. Some people have claimed they get 97% compression of some XML documents when using popular compression utilities like WinZip. Reducing the memory overhead of Names can be done at the parser level and is actually implemented in some fashion in every major parser I know of.

As for character content, the idea centers largely on each text node allocating a new String only if the application requests it. The String is created by looking up all of the character fragments stored in some sort of symbol table and assembling the String from them. Then the String would be cached. If the text node is mutated in any way, however, the String reference is set back to null.

On second thought, this may not degrade performance too much, as you get the added benefit of only needing to allocate memory for an integer array (the sequence of symbols used to rebuild the string from the symbol table) instead of using a String, which allocates two objects: the String object itself and the character array contained within it. Of course this optimization is Java-specific, and in languages like C++ or Eiffel, where heap-based objects are not as expensive to deal with, it may be counter-productive. Who knows, it might be counter-productive in Java too. I guess there is only one way to find out, unless someone has already tried this and has some insight they can lend.

Most parsers and parser interfaces like SAX present the character data as characters and not as Strings.
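To make the idea concrete, here is a rough sketch of a text node that stores only an integer array of symbol ids and builds its String on demand. The class names (`SymbolTable`, `LazyText`), the choice to split text into runs of whitespace and non-whitespace, and the method names are all assumptions made for illustration; none of this comes from an existing parser or from the DOM API. The constructor takes a `char[]` range because that is how SAX-style parsers hand over character data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical table shared by a whole document: one entry per distinct fragment. */
final class SymbolTable {
    private final Map<String, Integer> ids = new HashMap<>();
    private final List<String> fragments = new ArrayList<>();

    /** Returns the id of buf[start, start+len), adding the fragment if unseen. */
    int intern(char[] buf, int start, int len) {
        // Sketch only: a tuned version would hash the char range directly
        // instead of creating a temporary String for the lookup.
        String fragment = new String(buf, start, len);
        Integer id = ids.get(fragment);
        if (id == null) {
            id = fragments.size();
            fragments.add(fragment);
            ids.put(fragment, id);
        }
        return id;
    }

    String fragment(int id) { return fragments.get(id); }

    int size() { return fragments.size(); }
}

/** Hypothetical text node: keeps symbol ids, builds the String only when asked. */
final class LazyText {
    private final SymbolTable table;
    private int[] symbols;
    private String cached; // null until requested, and null again after a mutation

    LazyText(SymbolTable table, char[] buf, int start, int len) {
        this.table = table;
        this.symbols = encode(table, buf, start, len);
    }

    /** Splits the range into alternating whitespace / non-whitespace runs. */
    private static int[] encode(SymbolTable table, char[] buf, int start, int len) {
        int[] out = new int[len]; // upper bound: one fragment per character
        int n = 0;
        int end = start + len;
        int i = start;
        while (i < end) {
            boolean ws = Character.isWhitespace(buf[i]);
            int j = i + 1;
            while (j < end && Character.isWhitespace(buf[j]) == ws) {
                j++;
            }
            out[n++] = table.intern(buf, i, j - i);
            i = j;
        }
        return Arrays.copyOf(out, n);
    }

    /** Builds the String from the symbol table on first use, then reuses it. */
    String getData() {
        if (cached == null) {
            StringBuilder sb = new StringBuilder();
            for (int id : symbols) {
                sb.append(table.fragment(id));
            }
            cached = sb.toString();
        }
        return cached;
    }

    /** Any mutation drops the cached String, as described above. */
    void appendData(char[] buf, int start, int len) {
        int[] extra = encode(table, buf, start, len);
        int[] merged = Arrays.copyOf(symbols, symbols.length + extra.length);
        System.arraycopy(extra, 0, merged, symbols.length, extra.length);
        symbols = merged;
        cached = null;
    }

    boolean isCached() { return cached != null; }
}
```

Whether this actually saves memory depends on how redundant the text is: each fragment occurrence still costs one `int`, so a document full of unique one-character fragments would come out worse than plain Strings. Measuring on real documents is the only way to tell.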
So building the DOM tree without ever needing to create any new String objects initially is very much doable.

I guess the real question is: should the DOM even be used for multi-megabyte documents in the first place? Initially I thought of XML as something that would be used for two main purposes: EDI-like web transactions and as a replacement for HTML. It seems like people are now using it for so many other things, many of which may not be suited to XML's abilities. I guess the responsibility of XML tools developers is to provide the most abstract functionality possible so people can do many more things with XML than what it was intended for. Nevertheless, I think it is also a responsibility not to sell XML as the do-all solution to every computing problem known to man.

Tyler

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)








