XML-DEV Mailing List Archive

Re: Fast text output from SAX?


Thomas B. Passin wrote:

>Dennis Sosnoski wrote:
>
>>The problem, which I've expressed more than once, is to compare the 
>>performance for the alternatives of using text XML vs. some post-parse 
>>representation of XML documents. For the reasons given in my earlier 
>>email I'm choosing to base my timing comparisons on the parse event 
>>stream. 
>
>But presumably the alternative "quasi-xml" you will be testing is not 
>likely to be producing SAX events, but some proprietary parse 
>system instead.  Do you mean that you want to write a proprietary-to-
>textual XML file and compare that with writing a SAX-to-textual XML file?
>
The focus of what I'm doing is looking at general XML document 
interchange performance. The XBIS results I posted prior to the W3C 
workshop last fall (http://xbis.sourceforge.net/performance.html) are 
from a similar set of tests, which I'm now extending in a couple of ways.

As I see it the most useful comparisons to be made for general XML 
document interchange performance are (1) how much time is required to 
convert an incoming document to a form usable by the application, (2) 
how much time is required to convert from the internal form used by the 
application to a form that's serialized for transmission, and (3) what's 
the size of the serialized form. For the specific tests I'm running now 
I want to compare XBIS, text, and zipped text.
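A minimal sketch of how measurements (1) and (3) could be taken for the text and zipped text cases, using only the JDK's JAXP parser and `java.util.zip`. This is not the actual test harness behind the posted results: the class name `InterchangeTiming`, the generated sample document, and the single un-warmed timing pass are all assumptions made for illustration, and XBIS itself is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical harness: names and sample data are illustrative only.
public class InterchangeTiming {
    /** Counts start-element events so each parse can be sanity-checked. */
    static class CountingHandler extends DefaultHandler {
        int elements;

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            elements++;
        }
    }

    /** Drives a SAX parse over the stream and returns the element count. */
    static int parse(InputStream in) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        CountingHandler handler = new CountingHandler();
        factory.newSAXParser().parse(in, handler);
        return handler.elements;
    }

    static int parseText(byte[] xml) throws Exception {
        return parse(new ByteArrayInputStream(xml));
    }

    /** Inflates while parsing, so decompression cost is part of the input time. */
    static int parseGzipped(byte[] gz) throws Exception {
        return parse(new GZIPInputStream(new ByteArrayInputStream(gz)));
    }

    static byte[] gzip(byte[] data) throws Exception {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buf)) {
            out.write(data);
        }
        return buf.toByteArray();
    }

    /** Builds a small, repetitive document as stand-in test data. */
    static byte[] sampleDocument(int items) {
        StringBuilder sb = new StringBuilder("<items>");
        for (int i = 0; i < items; i++) {
            sb.append("<item id=\"").append(i).append("\">value ").append(i).append("</item>");
        }
        return sb.append("</items>").toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        byte[] text = sampleDocument(10_000);
        byte[] zipped = gzip(text);

        long t0 = System.nanoTime();
        parseText(text);
        long t1 = System.nanoTime();
        parseGzipped(zipped);
        long t2 = System.nanoTime();

        System.out.printf("text:   %d bytes, parse %d us%n", text.length, (t1 - t0) / 1000);
        System.out.printf("zipped: %d bytes, inflate+parse %d us%n", zipped.length, (t2 - t1) / 1000);
    }
}
```

A meaningful comparison would repeat each pass many times after JIT warm-up and use real documents of varying size and structure; the sketch only shows where the clock starts and stops.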

Obviously I can't test every possible internal form that might be used 
by an application. However, the vast majority of XML document processing 
in Java is currently built on the event streams produced by SAX parsers. 
Any general XML format should be convertible to and from an event stream 
of this type, and in practice that's the way any alternative general 
formats are likely to be used (at least in the near term).
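To illustrate what "convertible to and from an event stream" means for the plain text case, the sketch below parses text into SAX events and feeds those events straight into a serializer that writes text again. It relies on the standard JAXP `SAXTransformerFactory`; the class name `SaxRoundTrip` is hypothetical, and the cast assumes the default JDK transformer implementation, which supports the SAX feature. An alternative format would plug in at the same two points: a reader that emits `ContentHandler` calls, and a `ContentHandler` that writes the encoded form.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical example class: text -> SAX events -> text.
public class SaxRoundTrip {
    static String roundTrip(String xml) throws Exception {
        // Input side: a namespace-aware SAX2 reader produces the event stream.
        SAXParserFactory parserFactory = SAXParserFactory.newInstance();
        parserFactory.setNamespaceAware(true);
        XMLReader reader = parserFactory.newSAXParser().getXMLReader();

        // Output side: an identity TransformerHandler consumes SAX events
        // and serializes them as text XML.
        SAXTransformerFactory transformerFactory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler serializer = transformerFactory.newTransformerHandler();
        serializer.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.setResult(new StreamResult(out));

        // Connect the two ends and run the document through.
        reader.setContentHandler(serializer);
        reader.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("<a><b>hi</b></a>"));
    }
}
```

Timing only the serializer half of this pipeline (with the events replayed from memory rather than parsed) would correspond to the output-side measurement.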

This type of testing is admittedly only relevant for general-purpose 
formats. Schema-specific formats (such as the ASN.1 schema 
representation) get into a whole separate set of issues and should be 
compared differently. You *can* compare time and space performance for 
schema-specific formats vs. text (as Sun did in their "Fast Web 
Services" paper) or alternatives such as XBIS, but it's in some sense an 
apples-to-oranges comparison. Schema-specific formats are best suited to 
use with data-binding approaches, where the application doesn't 
really see XML as such, only objects that are mapped to XML components. 
I would expect that in these circumstances schema-specific formats would 
always be able to deliver better performance than general-purpose 
formats such as XBIS, let alone text. However, schema-specific formats 
are only usable when the documents being exchanged are known to follow 
those particular schemas. Even then there can be problems: since the 
schema-specific formats do not preserve raw text, they generally won't be 
usable with signing and such. General-purpose formats such 
as XBIS would not have this problem.

On the other hand, I think this *would* be a fair comparison test for 
the ASN.1 "fast infoset" approach that's been mentioned in related 
emails. If there's an implementation of this available for Java (that 
goes to and from SAX2) I'd be very interested in including it in my tests.

  - Dennis

-- 
Dennis M. Sosnoski
Enterprise Java, XML, and Web Services
Training and Consulting
http://www.sosnoski.com
Redmond, WA  425.885.7197


