Re: Data streams
This also speaks to the somewhat verbose form of XML that Office might be producing. It's certainly no surprise to anyone that the data was larger, and compressed differently, in XML than in CSV. Especially not with the example you proposed.

I think your conclusion about CSV's effectiveness is short-sighted. While CSV can certainly be "bit stingy", that often comes at the considerable cost of being brittle. Without effective metadata those numbers just become gibberish. While it's fair to say an XML file may be larger, it is larger in a remarkably self-documenting way.

Where's the balance to be struck? In lightweight CSV that's fraught with processing perils? Or in methodically documented XML that simply takes a few cycles longer? CPU and disk are cheap; programming time and budget to work around crappy, brittle data aren't.

It might be a more interesting experiment to discuss using more purpose-built XML schemas, doing a better job of describing the data in XML without being so verbose. While Office may not offer that at this point, that doesn't preclude others from doing a better job of it.

-Bill Kearney
Syndic8.com

----- Original Message -----
From: "Stephen E. Beller" <sbeller@n...>

> I tried Steven's experiment from a different angle. I filled an Excel XP
> spreadsheet with a single-digit number, saved it both as XML and as a
> comma-delimited text file (CSV). I then compressed both with WinZip and then
> opened both with Excel. Here's what I found:
>
> The XML file was 840MB, the CSV 34MB -- a 2,500% difference.
> Compressed, the XML file was 2.5MB, the CSV 0.15MB (150KB) -- a 1,670%
> difference.
>
> Equally dramatic is the time it took to uncompress and render the files as
> an Excel spreadsheet: It took about 20 minutes with the XML file; the CSV
> took 1 minute -- a 2,000% difference.
>
> My conclusion is that delimited text files handle large arrays of data more
> efficiently. This stems, in part, from the fact that a comma delimiter (or
> some other single character) carries much less overhead than tags; CSV
> requires only a comma, while XML requires a minimum of 5 characters (<></>)
> -- that makes CSV a minimum of 500% more efficient ... and when you add
> the semantic labels and attributes to the tags, the size of XML
> increases dramatically.
>
> Note, however, that when dealing with large blocks of text instead of
> numbers (or small text strings), the difference between XML and delimited
> text files is considerably less.
>
> Of course, XML offers benefits that a plain data array in a CSV file does
> not, such as attribute definitions and hierarchical associations between the
> data (if that's necessary) ... even though there are ways comma-delimited
> data can be used to perform the same functions as XML when rendering
> serialized data arrays as charts.
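For anyone who wants to try the size comparison on a smaller scale, here is a rough Python sketch. It fills a grid with a single digit, serializes it three ways, and reports raw and zlib-compressed byte counts. Everything in it is an assumption for illustration: the element names (`Table`, `Row`, `Cell`, `Data`, `t`, `r`) are made up and are not the schema Office actually emits, the "compact" layout is just one hypothetical way to be less verbose, and zlib stands in for WinZip, so the numbers will not match the figures quoted above.

```python
import zlib


def build_csv(rows, cols, value="7"):
    """One comma between values, one newline per row."""
    line = ",".join([value] * cols)
    return ("\n".join([line] * rows) + "\n").encode("ascii")


def build_verbose_xml(rows, cols, value="7"):
    """Hypothetical verbose layout: every cell wrapped in two elements.

    The element and attribute names are illustrative only.
    """
    cell = f'<Cell><Data Type="Number">{value}</Data></Cell>'
    row = "<Row>" + cell * cols + "</Row>"
    return ("<Table>" + row * rows + "</Table>").encode("ascii")


def build_compact_xml(rows, cols, value="7"):
    """Hypothetical compact layout: one element per row,
    values separated by single spaces inside it."""
    row = "<r>" + " ".join([value] * cols) + "</r>"
    return ("<t>" + row * rows + "</t>").encode("ascii")


def report(rows=1000, cols=256):
    """Return {format name: (raw bytes, compressed bytes)}."""
    builders = {
        "csv": build_csv,
        "verbose_xml": build_verbose_xml,
        "compact_xml": build_compact_xml,
    }
    results = {}
    for name, builder in builders.items():
        raw = builder(rows, cols)
        packed = zlib.compress(raw, 9)  # level 9 = best compression
        results[name] = (len(raw), len(packed))
    return results


if __name__ == "__main__":
    for name, (raw, packed) in report().items():
        print(f"{name:12s} raw={raw:>10,d}  compressed={packed:>8,d}")
```

The compact layout keeps per-row markup but drops per-cell tags, which is one way to trade some self-description for size; whether that trade is acceptable depends on how much metadata the consumer of the data needs.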