
Re: Validation vs performance - was Re: Fast text ou


David Megginson wrote:

> Stephen D. Williams wrote:
>
>> Processing overhead, including the major components of parsing / 
>> object creation / data copies / serialization, is not a 'future 
>> problem'.  It has always been a problem.
>
> We don't know how much and what kind of a problem XML will be until we've
> had time to gain experience -- if we try to optimize too early, we'll end up
> optimizing the wrong thing.

I suppose "early" and "time to gain experience" are relative.

> For example, I set up a test for a customer a while back to see how fast
> Expat could parse documents.  On my 900 MHz Dell notebook, with 256MB RAM
> and Gnome, Mozilla, and XEmacs competing for memory and CPU, Expat could
> parse about 3,000 1K XML documents per second (if memory does not fail me).
> If I had tried to, say, build DOM trees from that, I expect that the number
> would have fallen into the double digits (in C++) or worse.  In this case,
> obviously, there would be far more to be gained from optimizing the code on
> the other side of the parser (say, by implementing a reusable object pool or
> lazy tree building) than there would be from replacing XML with something
> that parsed faster.
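
[Editor's sketch, not part of the original message.] The comparison
described above can be reproduced in rough form with Python's standard
library, which wraps Expat. This is only an illustration under stated
assumptions: the document content, the helper names (make_document,
count_events, count_dom_elements, docs_per_second) and the repeat count
are all made up here, and the numbers it produces say nothing about the
900 MHz notebook test quoted above.

```python
import time
import xml.dom.minidom
import xml.parsers.expat


def make_document(size=1024):
    """Build a small XML document of at least `size` bytes (made-up content)."""
    items = []
    doc = "<doc></doc>"
    while len(doc) < size:
        items.append('<item id="%d">value %d</item>' % (len(items), len(items)))
        doc = "<doc>" + "".join(items) + "</doc>"
    return doc.encode("utf-8")


def count_events(data):
    """Parse with Expat and do nothing but count start-element events."""
    counter = [0]

    def start(name, attrs):
        counter[0] += 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.Parse(data, True)
    return counter[0]


def count_dom_elements(data):
    """Build a full DOM tree from the same bytes, then count its elements."""
    dom = xml.dom.minidom.parseString(data)
    n = len(dom.getElementsByTagName("*"))
    dom.unlink()
    return n


def docs_per_second(func, data, repeat=200):
    """Call func(data) `repeat` times and return the observed rate."""
    start = time.perf_counter()
    for _ in range(repeat):
        func(data)
    elapsed = time.perf_counter() - start
    return repeat / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    data = make_document()
    print("events only:", docs_per_second(count_events, data))
    print("DOM tree:   ", docs_per_second(count_dom_elements, data))
```

Both functions see the same bytes and count the same elements; the only
difference is whether a tree of objects is built along the way, which is
the cost on "the other side of the parser" that the quoted text refers to.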

Why assume that "optimizing the code on the other side of the parser" is
the first or only step?  I posit that this is not the best way to proceed
and that it artificially narrows the possible solutions.  The steps needed
to parse XML, such as processing Expat events, impose a minimum amount of
work.  Once the data has been parsed, it must be put into a usable form,
and data in a usable form must be serialized again at some point.  The
format, and the difference between it and in-memory formats, sets the
theoretical minimum amount of work.  Other data formats have lower
minimum bounds.

> ...
>
>> The scarce resource is time.  Anything that eats time is bad.  This could
>> be bandwidth usage, CPU, memory, or suboptimal communication and semantic
>> models.
>
> I have some experience with high-volume, high-speed systems as well.  They
> tend to be so finely hand-tuned that they couldn't use *any* off-the-shelf
> format or protocol, much less XML or SOAP -- even HTTP (or in some cases,
> TCP) is out of the question.  These are the kinds of people who will use
> deltas to avoid wasting four bytes on every number.

Of course ;-).
I'm just trying to spread the efficiency to something standard.

> All the best,
>
> David


sdw

-- 
swilliams@h... http://www.hpti.com Per: sdw@l... http://sdw.st
Stephen D. Williams 703-724-0118W 703-995-0407Fax 20147-4622 AIM: sdw


