Re: Free Tool for Efficient XML Data Compression
Philip Boutros
>
> > Thomas B. Passin wrote:
>
> > I also took a 2-column Word97 document - 63.5K -
> > and opened it with Abiword (www.abisource.com)
> > then saved it. Abiword uses an XML file format as
> > its native format. Abiword XML file size: 48.8K.
>
> While I would love to chime in on the whole compressed XML discussion (my
> guess is word dictionaries and skip lists should slaughter gzip in terms of
> compression size) I would like to address this statement in particular.
>
> 1.
> Comparing Word97's file size to that of the current version of Abiword is a
> ludicrous exercise. I am very familiar with the Microsoft Word file format
> (no, I don't work for Microsoft and never have) and while it contains a
> number of inefficiencies and chunks of legacy garbage, given how much it
> encodes it is reasonably efficient for large documents. I can name at least
> a hundred features (styles, page layout, frames, borders, backgrounds,
> graphics, fields, properties, etc.) that Word97 must deal with in its file
> format that Abiword's format does not address. In fact, given how little
> Abiword encodes, I was surprised that Abiword wasn't 10 times as efficient.
> See #2 for a tirade about that.
>
<snip/>

Well, of course I know Word documents include a ton of stuff that, say, Abiword files don't. And this particular file doesn't even have any VBA macros of my own in it. And I'm not arguing for Abiword's file format, either.

In this case, though, my document doesn't need the rest of that stuff Word includes. So doing this conversion gave me some rough way to compare sizes when the two documents were typical of the type I often use.

For an XML document, you could have argued that some other format doesn't need end tags so of course XML would be bigger. That's not the point. The point is that - I think it will turn out this way - for many actual cases, the supposed size disadvantage of an XML document will be relatively small or non-existent.

Of course, the XML standard was developed under the guideline that "terseness ... is of minimal importance", so if file size alone is going to be the driver, XML might not be a favorable candidate.

Tom Passin

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@i... the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)