XML-DEV Mailing List Archive

Re: Pushing all the buttons


converting bytes to characters

> time invested in XML parsing would pay dividends.

Yes, but it's not just an XML parsing problem.  You've got to look at 
the whole process of going from the block of data you get from the kernel 
to the in-memory application-specific Java objects (and vice-versa).  In a 
typical Java implementation today, this involves numerous layers:

1. The data gets copied from the buffer where the kernel put it into 
memory under control of the Java runtime.

2. The data gets copied through a buffering stage by a BufferedInputStream.

3. The bytes get turned into characters using an InputStreamReader.

4. The XML parser processes the characters and delivers SAX events.

5. The XML data binding tool does its thing and turns those events 
into application-specific objects.
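
As a rough illustration of how these layers stack up, here is a minimal sketch using the standard JAXP SAX API.  The Point class, the <point><x/><y/></point> document shape and the hand-written handler are assumptions made up for this example; the handler stands in for a real data binding tool.  Layer 1 happens inside the raw stream's read call, so it does not show up as a line of code.

```java
import java.io.BufferedInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical application-specific object (not from the original post).
final class Point {
    int x;
    int y;
}

final class LayeredPipeline {
    // Layer 5: a hand-written stand-in for a data binding tool.
    static final class PointHandler extends DefaultHandler {
        final Point point = new Point();
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            text.setLength(0); // start collecting text for the new element
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("x")) {
                point.x = Integer.parseInt(text.toString().trim());
            } else if (qName.equals("y")) {
                point.y = Integer.parseInt(text.toString().trim());
            }
        }
    }

    // Layer 1 (kernel buffer to JVM memory) happens inside raw.read().
    static Point read(InputStream raw) {
        try {
            InputStream buffered = new BufferedInputStream(raw);                    // layer 2
            Reader chars = new InputStreamReader(buffered, StandardCharsets.UTF_8); // layer 3
            PointHandler handler = new PointHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(chars), handler);                        // layer 4
            return handler.point;                                                   // layer 5 result
        } catch (Exception e) {
            throw new IllegalStateException("could not read point document", e);
        }
    }
}
```

Every byte of the document passes through each of these stages before a single field of Point is set.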

With Sun's Fast Web Services stuff, they are going directly from a 
sequence of bytes to application-specific objects, cutting out at least 
two of the layers in the XML-based implementation.  I am quite willing 
to believe they can get an order of magnitude improvement.

However, it is also possible to apply the same approach to XML.  I 
believe this would give a substantial performance improvement.  The 
basic idea is you would have a data binding tool that compiles a schema 
into something that would operate not on SAX events but directly on the 
bytes representing the XML document.

To make this practical a little XML subsetting is required.  First, I 
think you would need to do what the SOAP folks have done and disallow 
DTDs; handling entities would make this approach very difficult. 
Second, you really need to fix on a single encoding.  I think UTF-8 is 
the obvious choice for Web services.  A single encoding allows you to 
cut out a whole layer of your processing stack.  Instead of converting 
bytes to characters and then parsing those characters into objects, you 
can parse the bytes directly into objects.  For maximum 
interoperability, you could use the optimized code-path when the XML 
keeps to the subset and fall back to the general but slow code-path when 
it doesn't.
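
Here is a minimal sketch of what such a binder might look like for one trivial schema.  Everything in it is an assumption for illustration: the Coord class, the <point><x/><y/></point> document shape, and a fast path written by hand where a real tool would generate it from the schema.  The fast path matches UTF-8 bytes directly and gives up on anything outside a very narrow subset (whitespace, an XML declaration, character references, a DOCTYPE), in which case a general parser takes over.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical application-specific object (not from the original post).
final class Coord {
    final int x;
    final int y;
    Coord(int x, int y) { this.x = x; this.y = y; }
}

final class DirectBinder {
    // Markup the "compiled" fast path expects, as UTF-8 bytes.
    private static final byte[] OPEN = "<point><x>".getBytes(StandardCharsets.UTF_8);
    private static final byte[] MID = "</x><y>".getBytes(StandardCharsets.UTF_8);
    private static final byte[] CLOSE = "</y></point>".getBytes(StandardCharsets.UTF_8);
    private static final long NOT_SUBSET = Long.MIN_VALUE;

    private final byte[] buf;
    private int pos;
    boolean usedFastPath; // exposed only so callers can see which path ran

    DirectBinder(byte[] buf) { this.buf = buf; }

    Coord bind() {
        Coord c = fast();
        usedFastPath = (c != null);
        return c != null ? c : slow();
    }

    // Fast path: bytes straight to fields, with no Reader and no SAX events.
    private Coord fast() {
        pos = 0;
        if (!expect(OPEN)) return null;
        long x = readInt();
        if (x == NOT_SUBSET || !expect(MID)) return null;
        long y = readInt();
        if (y == NOT_SUBSET || !expect(CLOSE) || pos != buf.length) return null;
        return new Coord((int) x, (int) y);
    }

    private boolean expect(byte[] token) {
        if (buf.length - pos < token.length) return false;
        for (int i = 0; i < token.length; i++) {
            if (buf[pos + i] != token[i]) return false;
        }
        pos += token.length;
        return true;
    }

    // Reads an optional '-' and 1 to 9 ASCII digits, so the value always fits in an int.
    private long readInt() {
        boolean negative = pos < buf.length && buf[pos] == '-';
        if (negative) pos++;
        int start = pos;
        long value = 0;
        while (pos < buf.length && buf[pos] >= '0' && buf[pos] <= '9' && pos - start < 9) {
            value = value * 10 + (buf[pos++] - '0');
        }
        if (pos == start) return NOT_SUBSET;
        return negative ? -value : value;
    }

    // Slow path: a general XML parser handles whatever the fast path rejected.
    private Coord slow() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(buf));
            return new Coord(intOf(doc, "x"), intOf(doc, "y"));
        } catch (Exception e) {
            throw new IllegalArgumentException("not a valid point document", e);
        }
    }

    private static int intOf(Document doc, String tag) {
        return Integer.parseInt(doc.getElementsByTagName(tag).item(0).getTextContent().trim());
    }
}
```

For example, new DirectBinder(bytes).bind() takes the fast path for the bytes of <point><x>3</x><y>-4</y></point>, and falls back to the general parser when the same data arrives pretty-printed or with an XML declaration.  A subset useful in practice would be wider than this one; the sketch only shows the shape of the two code-paths.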

I think the appropriate measure of the value of Sun's Fast Web Services 
approach is what performance improvement it could offer over the sort of 
approach I've described.

James


