
RE: RFC: Attributes and XML-RPC

  • From: "Julian Reschke" <reschke@m...>
  • To: "Tyler Baker" <tyler@i...>,"Julian Reschke" <reschke@m...>
  • Date: Sun, 26 Sep 1999 13:45:14 +0200

I would expect that gzip sees the attribute as

    ="
    name
    "

while it sees elements as

    <
    name>
    </
    name>

That means three substrings for attributes versus four substrings for elements, of which only one occurs twice.
 
It seems the corrected benchmark proves that my surprise was not without reason :-) (posting attached again).
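
For anyone who wants to repeat the comparison, here is a minimal Python sketch. It is not Mark's make.pl, which is not shown in this thread. The record fields (id, name, price), the record count, and all function names are assumptions made up for illustration. It writes the same records once in attribute style and once in child-element style and gzips both.

```python
import gzip
import random


def make_records(n, seed=0):
    """Build n hypothetical records; the field names are illustrative only."""
    rng = random.Random(seed)
    return [
        {
            "id": str(i),
            "name": f"item{rng.randrange(10_000)}",
            "price": f"{rng.randrange(100_000) / 100:.2f}",
        }
        for i in range(n)
    ]


def as_attributes(records):
    """Serialize each record as one empty element carrying attributes."""
    rows = (
        "<rec " + " ".join(f'{k}="{v}"' for k, v in r.items()) + "/>"
        for r in records
    )
    return ("<data>\n" + "\n".join(rows) + "\n</data>\n").encode("utf-8")


def as_children(records):
    """Serialize each record as an element with one child element per field."""
    rows = (
        "<rec>" + "".join(f"<{k}>{v}</{k}>" for k, v in r.items()) + "</rec>"
        for r in records
    )
    return ("<data>\n" + "\n".join(rows) + "\n</data>\n").encode("utf-8")


def gzip_size(data):
    """Return the size in bytes of the gzip-compressed data."""
    return len(gzip.compress(data))


records = make_records(5000)
attrib = as_attributes(records)
child = as_children(records)

for label, data in (("attrib.xml", attrib), ("child.xml", child)):
    print(f"{label}: raw={len(data)} gzip={gzip_size(data)}")
```

Which compressed file comes out smaller depends on the data, so the numbers from this sketch should not be expected to match the figures quoted below. Changing the field names, the number of fields, or the values is likely to shift the result.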
-----Original Message-----
From: Tyler Baker [mailto:tyler@i...]
Sent: Sunday, September 26, 1999 10:04 AM
To: Julian Reschke
Cc: Mark Nutter; xml-dev@i...
Subject: Re: RFC: Attributes and XML-RPC

Julian Reschke wrote:
 I find this way surprising. Yes, it's clear that opening and closing tags compress well, but I would still have expected the attribute version to be smaller... 


The two quotes, comma, equals sign, and attribute name take up 5-10 bytes per attribute when mapped to a symbol table, while the element style takes up 2-4 bytes for the start tag and the end tag. The exact number of bytes used for each attribute or element depends on the compression algorithm and on the number of distinct attribute or element names used. So it is no surprise to me that GZIP produces a compressed attribute file that is larger (-:

Tyler  

 

-- 

> -----Original Message-----
> From: Mark Nutter [SMTP:mnutter@f...]
> Sent: Thursday, September 23, 1999 6:34 AM
> To: Blair Murri; xml-dev@i...
> Subject: RE: RFC: Attributes and XML-RPC
>
> Doh! "Code in haste, debug in leisure." Ok, here are the corrected
> results:
>
> $ ./make.pl
> $ ls -l
> total 14206
> -rw-rw-r-- 1 mnutter mnutter 5811846 Sep 23 08:28 attrib.xml
> -rw-rw-r-- 1 mnutter mnutter 8672170 Sep 23 08:28 child.xml
> -rwxr-xr-x 1 mnutter mnutter 999 Sep 23 08:26 make.pl
> -rwxr-xr-x 1 mnutter mnutter 976 Sep 22 13:16 make.pl~
> $ gzip attrib.xml
> $ gzip child.xml
> $ ls -l
> total 1332
> -rw-rw-r-- 1 mnutter mnutter 670757 Sep 23 08:28 attrib.xml.gz
> -rw-rw-r-- 1 mnutter mnutter 681080 Sep 23 08:28 child.xml.gz
> -rwxr-xr-x 1 mnutter mnutter 999 Sep 23 08:26 make.pl
> -rwxr-xr-x 1 mnutter mnutter 976 Sep 22 13:16 make.pl~
>
> So child.xml.gz is slightly larger than attrib.xml.gz (about 1.5%,
> assuming my math isn't as bad as my coding).
>
> That is what I would have expected. But performance may be more than
> network transport. I've got an app where, for convenience, I have
> wrapped MS's IE 5 MSXML parser, and I found a significant difference in
> speed by placing stuff that there was only one of as an attribute instead
> of a child element. I don't know why yet, but the attributes were faster
> (I don't know if that was "recording" speed or "retrieval" speed) than the
> child elements using just that one parser. Of course, your mileage may
> vary.
>

Blair L. Murri
Sr. Programmer/etc.
Wavo Corporation

