Re: almost four years ago....
At 4:09 PM +0100 6/16/01, Alaric Snell wrote:

>This is easy to do. GZIP is massively crippled by having no information about
>the structure of the file - it's just a string of bytes that it has to make
>some assumptions about the probable structure of with regards to frequency
>distributions that won't even apply very well to XML; it's trivial to write
>something that compresses better, especially if you use gzip for what it's
>best at (the CDATA) and handle the <> bits yourself.
>

I've heard that one before too. In practice, it isn't nearly as easy as people think it is. After a great deal of effort, you may be able to shrink 1% or 2% more on some files. However, most people who try this end up producing something that is noticeably larger than gzip.

Of course you could use a better general purpose compression algorithm. bzip can grab you 5% or so a lot of the time, though it isn't as widely supported. Frankly, if you can't provide at least a 10% improvement then it's not worth my time to worry about. Better than 10% smaller, I don't think you can do without a lossy algorithm. You simply run into the limits of information theory.

>> 3. Human legible/human editable data doesn't matter.
>
>Indeed, we must never use image files, filesystems, or gzip - they'll never
>take off :-)
>

This is a canard. Nobody uses XML for this stuff anyway.

>> All three beliefs have been empirically proven false time and time
>> again.
>
>Chuckle!
>

Hey, don't let me stop you from trying! I could be wrong, in which case we can all benefit from your efforts. But I think that if you're really smart and try really hard and devote months of your life to this problem, you aren't even going to get a 10% improvement over gzip. (You might not get any improvement at all.) And even if you do get that 10% improvement, I suspect you'll discover your system is so inconvenient compared to plain or gzipped XML that nobody will use it. But after all, it's your life. If you've got the time to spend on this, feel free to try. I'm just afraid you'll get the same results as the last two dozen people who tried this.

-- 
+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@m...            | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|                  The XML Bible (IDG Books, 1999)                   |
|              http://metalab.unc.edu/xml/books/bible/               |
|   http://www.amazon.com/exec/obidos/ISBN=0764532367/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://metalab.unc.edu/javafaq/ |
|  Read Cafe con Leche for XML News: http://metalab.unc.edu/xml/     |
+----------------------------------+---------------------------------+
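
For anyone who wants to check the gzip versus bzip figures on their own files, here is a minimal measurement sketch. It is not from the original message. It uses Python's standard gzip and bz2 modules, and the sample document, the helper names make_sample_xml and compare, and the record layout are all made up for illustration. The numbers it prints depend heavily on the input, so synthetic data like this says little about real-world XML; substitute your own files before drawing conclusions.

```python
import bz2
import gzip
import random


def make_sample_xml(records: int = 2000, seed: int = 0) -> bytes:
    """Build a synthetic, record-oriented XML document (hypothetical data)."""
    rng = random.Random(seed)
    names = ["alice", "bob", "carol", "dave"]
    rows = []
    for i in range(records):
        name = rng.choice(names)
        amount = rng.randint(1, 99999) / 100
        rows.append(
            f'  <order id="{i}">\n'
            f"    <customer>{name}</customer>\n"
            f'    <amount currency="USD">{amount:.2f}</amount>\n'
            f"  </order>\n"
        )
    doc = '<?xml version="1.0"?>\n<orders>\n' + "".join(rows) + "</orders>\n"
    return doc.encode("utf-8")


def compare(data: bytes) -> dict:
    """Return raw, gzip and bzip2 sizes plus bzip2's saving relative to gzip."""
    sizes = {
        "raw": len(data),
        "gzip": len(gzip.compress(data, compresslevel=9)),
        "bzip2": len(bz2.compress(data, compresslevel=9)),
    }
    # Positive means bzip2 output is smaller than gzip output, in percent.
    sizes["bzip2_vs_gzip_pct"] = (
        100.0 * (sizes["gzip"] - sizes["bzip2"]) / sizes["gzip"]
    )
    return sizes


if __name__ == "__main__":
    result = compare(make_sample_xml())
    for key, value in result.items():
        print(f"{key}: {value}")
```

To measure a real document instead, read it with open(path, "rb").read() and pass the bytes to compare(). Any custom XML-aware compressor can be judged the same way: compute its output size and compare it against the "gzip" entry to see whether it clears the 10% bar discussed above.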