
Re: No XML Binaries? Buy Hardware

  • From: noah_mendelsohn@u...
  • To: Elliotte Harold <elharo@m...>
  • Date: Fri, 23 Feb 2007 16:31:39 -0500

Elliotte Harold writes:

> I don't think we've hit the limits of parser performance yet,

I think that asking "what are the theoretical limits" is very important, 
and it's not a question I've seen discussed often enough.  From our paper 
[1]:

> No parser can process input faster than its supporting hardware
> accesses data, but the additional cost of parsing and 
> validation should be minimized. On a 1 GHz Pentium processor a 
> simple character-scanning loop runs at about 100 Mbytes/second,
> which is 10 cycles/byte. 

> [...]

> On the tests reported in this paper, using the business object 
> API typical of Web Services applications, XML Screamer parses 
> and schema-validates XML at between 23 and 46 Mbytes/sec/GHz; 
> XML Screamer can thus process XML at speeds of roughly 100–200 
> Mbytes/sec on the 4 GHz processors now becoming available. 

> [...]

> Using its business object APIs, XML Screamer scans, parses, 
> validates and deserializes at between 22% and 44% of the tested
> processor's raw character scanning speed. Except insofar as 
> ways can be found to use such processors more efficiently, e.g.
> by exploiting hardware string test instructions or on chip SIMD
> accelerators, gains from further tuning or alternative 
> approaches are likely to be modest. XML Screamer's performance 
> is probably not far from the maximum achievable.
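
The arithmetic behind the quoted figures can be checked directly. The following is a minimal sketch that uses only the numbers quoted above; the function names are illustrative and do not come from the paper, "Mbytes" is assumed to mean 10**6 bytes, and throughput is assumed to scale linearly with clock rate, as the quote's extrapolation to 4 GHz implies.

```python
MB = 1_000_000        # assumption: the quote's "Mbytes" means 10**6 bytes
GHZ = 1_000_000_000   # cycles per second at 1 GHz


def cycles_per_byte(clock_hz, bytes_per_sec):
    """Processor cycles spent per input byte at a given throughput."""
    return clock_hz / bytes_per_sec


def scaled_throughput(mbytes_per_sec_per_ghz, clock_ghz):
    """Throughput in Mbytes/sec, assuming linear scaling with clock rate."""
    return mbytes_per_sec_per_ghz * clock_ghz


if __name__ == "__main__":
    # Simple character-scanning loop: 100 Mbytes/sec on a 1 GHz processor.
    print("scan loop:", cycles_per_byte(1 * GHZ, 100 * MB), "cycles/byte")

    # Quoted parse-and-validate range: 23 to 46 Mbytes/sec/GHz.
    for rate in (23, 46):
        print(
            rate, "Mbytes/sec/GHz:",
            round(cycles_per_byte(1 * GHZ, rate * MB), 1), "cycles/byte,",
            scaled_throughput(rate, 4), "Mbytes/sec at 4 GHz",
        )
```

Under those assumptions the scanning loop costs 10 cycles/byte, the quoted range corresponds to roughly 43.5 and 21.7 cycles/byte, and the 4 GHz extrapolation gives 92 and 184 Mbytes/sec, which matches the quote's "roughly 100–200".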

In short, we observed that to check well-formedness, a parser must at 
least touch each input character.  You can benchmark various 
processor/memory combinations using their most optimized forms of 
character and string comparison and find out how fast they can inspect 
each byte of an input buffer, doing the sorts of character comparisons 
necessary for well-formedness checking.  There may be ways to do better 
than we did on particular processors, but I think it's interesting that 
one can set a pretty good bound on how fast XML processing can go. 
Furthermore, I think our work shows that it is possible to get not too far 
from that bound, for some definition of "not too far" :-).
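
The kind of measurement described above can be sketched as a loop that touches every byte once and compares it against a few markup delimiters. This is an illustration of the method only, not the benchmark from the paper: the function names and the choice of delimiters are assumptions, and an interpreted Python loop runs far slower than the optimized native loop the paper refers to, so the absolute number it prints says nothing about the hardware bound.

```python
import time

# Assumption: a small set of delimiter bytes stands in for the character
# comparisons a real well-formedness check would perform.
MARKUP = frozenset(b"<>&")


def count_markup_bytes(buf):
    """Touch every byte of buf once and count the markup delimiters."""
    count = 0
    for b in buf:  # iterating over a bytes object yields ints
        if b in MARKUP:
            count += 1
    return count


def scan_rate_mbytes_per_sec(buf, repeats=5):
    """Best observed scan rate over several passes, in 10**6 bytes/sec."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        count_markup_bytes(buf)
        best = min(best, time.perf_counter() - start)
    best = max(best, 1e-9)  # guard against a zero reading from a coarse timer
    return len(buf) / best / 1_000_000


if __name__ == "__main__":
    sample = b"<item id='1'>text &amp; more</item>\n" * 20_000
    print(f"{scan_rate_mbytes_per_sec(sample):.1f} Mbytes/sec")
```

Dividing the processor's clock rate by the measured bytes per second gives cycles per byte, the unit used in the paper excerpt. A meaningful bound would need the same loop written in a compiled language and tuned for the target processor.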

Noah

[1] http://www2006.org/programme/item.php?id=5011

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------





