RE: Data streams
In consideration of Elliotte's reply, I went back and looked at the XML file Excel generated. Here's what I found ...

Every one of the XML data elements had this tagging structure:

    <Row> <Cell><Data ss:Type="Number">1</Data></Cell> </Row>

In contrast, the CSV had this structure:

    1,

That's roughly a 50-to-1 difference in characters for each data element. I doubt that all those XML tags are necessary if you're rendering the data in something other than a spreadsheet. But if you are planning to use a spreadsheet, then the 50 to 1 ratio is valid, it seems to me.

Does anyone know what a reasonable tagging equivalent might be if you're, say, distributing a data array in XML for SVG rendering? It might be fewer than 50 characters per element, but it will still be a lot more than 1, especially if you have data type attributes.

In addition, the XML doc had about 50 lines of additional tags at the beginning and end of the file, which was Microsoft Office metadata not in the CSV. While some of those tags are certainly necessary for a valid XML doc, I'm sure some are superfluous. But even if you subtracted all those lines from the total character count, it would have almost no effect on the size comparison when you're dealing with a large data array.

So, this benchmark test still points to a huge difference in file size, and in unzipping and parsing time, between a large data array in CSV and the same array in XML.

Steve

-----Original Message-----
From: Elliotte Harold [mailto:elharo@m...]
Sent: Monday, December 06, 2004 2:43 PM
To: Stephen E. Beller
Cc: xml-dev@l...
Subject: Re: Data streams

Stephen E. Beller wrote:

> I tried Steven's experiment from a different angle. I filled an Excel XP
> spreadsheet with a single-digit number, saved it in both XML and in a
> comma-delimited text file (CSV). I then compressed both with WinZip and then
> opened both with Excel. Here's what I found:

That sounds like a bad test. The XML file contains a lot more information than the CSV file. Specifically, it contains a lot of Microsoft Office metadata, about things like the name of the person who created the file, that is not in the CSV file. There is information in the XML file that is not present in the CSV file.

--
Elliotte Rusty Harold
elharo@m...
XML in a Nutshell 3rd Edition Just Published!
http://www.cafeconleche.org/books/xian3/
http://www.amazon.com/exec/obidos/ISBN=0596007647/cafeaulaitA/ref=nosim
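
For anyone who wants to check the per-element arithmetic discussed in this thread, here is a minimal Python sketch. It is not taken from either message. The cell markup is copied from Steve's post with its whitespace flattened, so the ratio it prints will not necessarily match the 50-to-1 figure, which depends on the indentation and line breaks in the real file. The names overhead_ratio and estimated_sizes and the 2,000-character figure for the fixed Office metadata are assumptions made up for illustration, not measurements.

```python
# Illustrative sketch only. The cell markup is copied from the message above;
# a real Excel XML file also has indentation, newlines, and workbook metadata.
XML_CELL = '<Row> <Cell><Data ss:Type="Number">1</Data></Cell> </Row>'
CSV_CELL = "1,"


def overhead_ratio(xml_cell: str, csv_cell: str) -> float:
    """Characters of XML per character of CSV for one data element."""
    return len(xml_cell) / len(csv_cell)


def estimated_sizes(cells: int, fixed_xml_overhead: int = 0) -> tuple[int, int]:
    """Rough uncompressed sizes (XML, CSV) in characters for `cells` elements.

    fixed_xml_overhead stands in for the metadata at the start and end of
    the XML file; its value is a hypothetical placeholder.
    """
    xml_size = fixed_xml_overhead + cells * len(XML_CELL)
    csv_size = cells * len(CSV_CELL)
    return xml_size, csv_size


if __name__ == "__main__":
    print(f"per-cell ratio: {overhead_ratio(XML_CELL, CSV_CELL):.1f}")
    for n in (100, 10_000, 1_000_000):
        xml_size, csv_size = estimated_sizes(n, fixed_xml_overhead=2_000)
        print(n, xml_size, csv_size, f"{xml_size / csv_size:.1f}")
```

As the number of cells grows, the fixed metadata matters less and less and the overall ratio settles toward the per-cell ratio, which is the point Steve makes about large arrays. The sketch says nothing about compressed size or parse time; those would have to be measured.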