
Re: Fast text output from SAX?


Bob Wyman wrote:

>Dennis Sosnoski wrote:
>
>>I think this *would* be a fair comparison test for 
>>the ASN.1 "fast infoset" approach
>
>	Yes. You are correct. Both the ASN.1-based X.finfo and XBIS
>are binary encodings for "schema-free" XML documents. Thus, it makes
>sense to compare them to each other. It wouldn't be fair to compare
>XBIS to a "schema-based" binary encoding since the schema-based system
>would almost surely blow XBIS away in both compactness as well as
>parsing speed. 
>	It is important to note that by saying that well-implemented
>schema based approaches would be faster/smaller/etc. than XBIS, I'm
>not saying anything negative about XBIS. We're talking here about two
>classes of solution. (schema-based and schema-free) Each class is more
>or less appropriate and useful in different contexts and each has
>qualities that the other can't match. An orange should not be sorry
>that it is less crunchy than an apple.
>
I absolutely understand the distinction. I don't think the differences 
are likely to be as large as you seem to imply, though. A lot depends on 
the type of data used as the text content of the documents. If it's 
mostly types that convert well to binary, you may see a considerable 
size advantage for the schema-based approach, but I'd suspect this is a 
factor of 1.5-2x at most. Consider how much space you can save using a 
binary representation of an integer value vs. text:

  value (decimal) -> binary size, text size
  1 digit  -> 1 byte,  1 byte
  2 digits -> 1 byte,  2 bytes
  3 digits -> 2 bytes, 3 bytes
  4 digits -> 2 bytes, 4 bytes
  5 digits -> 3 bytes, 5 bytes
  etc.

This assumes you're using variable-length encoding of the binary values, 
with 7 bits of binary per byte of representation (where a digit count 
straddles a byte boundary, I've rounded the binary size up). And I think 
that's one of the *best* cases for a binary representation. It'd be a 
little faster to reconstruct a binary value from the variable-length 
encoding than it would be to convert the actual digits, but probably not 
by a lot. And for text values, a schema-based approach would offer no 
benefits at all.
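
To make the size comparison above concrete, here is a minimal sketch of a 
7-bits-per-byte variable-length integer encoding. The byte layout (high 
bit as a continuation flag, low-order group first) and the function names 
are assumptions chosen for illustration; this is not taken from the XBIS 
or X.finfo formats, which may lay out their bytes differently.

```python
def varint_encode(value: int) -> bytes:
    """Encode a non-negative integer using 7 payload bits per byte.

    Assumed layout: low-order 7-bit group first, high bit set on every
    byte except the last to signal that more bytes follow.
    """
    if value < 0:
        raise ValueError("this sketch handles non-negative values only")
    out = bytearray()
    while True:
        group = value & 0x7F
        value >>= 7
        if value:
            out.append(group | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(group)         # final byte: high bit clear
            return bytes(out)


def varint_decode(data: bytes) -> int:
    """Rebuild the integer from the encoding produced by varint_encode."""
    value = 0
    shift = 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value
    raise ValueError("truncated input")


def size_table(max_digits: int = 5):
    """Return (digits, worst-case binary bytes, text bytes) per digit count.

    The largest value with a given digit count is used, which corresponds
    to rounding the binary size up at the crossovers.
    """
    rows = []
    for digits in range(1, max_digits + 1):
        largest = 10 ** digits - 1
        rows.append((digits, len(varint_encode(largest)), digits))
    return rows


if __name__ == "__main__":
    for digits, binary_bytes, text_bytes in size_table():
        print(f"{digits} digit(s): binary {binary_bytes}, text {text_bytes}")
```

Under these assumptions, size_table() gives binary sizes of 1, 1, 2, 2 and 
3 bytes for values of 1 to 5 digits, against 1 to 5 bytes for the text 
form, so the saving stays under a factor of 2 across that range.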

One of the problems I have with Sun's "Fast Web Services" example is 
that they basically hardwired things in at a low level for their 
schema-based implementation, then compared that with the full text 
system. I haven't actually tried it out, but I suspect that if I 
combined a light-weight SOAP implementation I've built around my JiBX 
data binding framework with an XBIS transport layer I'd get better 
performance than the Sun team (maybe I should try it - think Sun would 
publish an article on "Fast-er Web Services" if I'm right? :-) ). That 
doesn't mean that XBIS is a "better" transport than the schema-based 
approaches, though, only that there are many factors involved in 
performance when you look at complex systems like Web services 
implementations, and the actual XML handling is only one of them.

  - Dennis

-- 
Dennis M. Sosnoski
Enterprise Java, XML, and Web Services
Training and Consulting
http://www.sosnoski.com
Redmond, WA  425.885.7197


