Re: Validation vs performance - was Re: Fast text ou
David Megginson wrote:

> Stephen D. Williams wrote:
>
>> Processing overhead, including the major components of parsing /
>> object creation / data copies / serialization, is not a 'future
>> problem'. It has always been a problem.
>
> We don't know how much and what kind of a problem XML will be until we've
> had time to gain experience -- if we try to optimize too early, we'll end up
> optimizing the wrong thing.

I suppose "early" and "time to gain experience" are relative.

> For example, I set up a test for a customer a while back to see how fast
> Expat could parse documents. On my 900 MHz Dell notebook, with 256MB RAM
> and Gnome, Mozilla, and XEmacs competing for memory and CPU, Expat could
> parse about 3,000 1K XML documents per second (if memory does not fail me).
> If I had tried to, say, build DOM trees from that, I expect that the number
> would have fallen into the double digits (in C++) or worse. In this case,
> obviously, there would be far more to be gained from optimizing the code on
> the other side of the parser (say, by implementing a reusable object pool or
> lazy tree building) than there would be from replacing XML with something
> that parsed faster.

Why make the assumption that "optimizing the code on the other side of
the parser" is the first or only step? I posit that this is not the
best way to proceed and artificially narrows possible solutions.

The steps needed to parse XML, such as processing Expat events, cause a
minimum amount of work. When that data has been parsed, it must be in a
usable form and data in a usable form must be serialized at some point.
The format and the difference between it and memory formats create a
minimum bound on the theoretical least amount of work. Other data
formats have lower minimum bounds.

> ...
>
>> The scarce resource is time. Anything that eats time is bad. This could
>> be bandwidth usage, CPU, memory, or suboptimal communication and semantic
>> models.
>
> I have some experience with high-volume, high-speed systems as well. They
> tend to be so finely hand-tuned that they couldn't use *any* off-the-shelf
> format or protocol, much less XML or SOAP -- even HTTP (or in some cases,
> TCP) is out of the question. These are the kinds of people who will use
> deltas to avoid wasting four bytes on every number.

Of course ;-). I'm just trying to spread the efficiency to something
standard.

> All the best,
>
> David

sdw

--
swilliams@h... http://www.hpti.com
Per: sdw@l... http://sdw.st
Stephen D. Williams 703-724-0118W 703-995-0407Fax 20147-4622 AIM: sdw
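
For anyone who wants to check the event-parsing versus tree-building gap on
their own hardware, here is a rough sketch in Python. It is not the test
described above: the synthetic document shape, the helper names (make_doc,
count_events, build_dom, docs_per_second) and the repeat count are all
assumptions made for illustration. It uses the standard library's Expat
binding and minidom, so the absolute numbers will not match a C or C++ setup;
only the ratio between the two runs is of interest.

```python
import time
import xml.dom.minidom
import xml.parsers.expat


def make_doc(target_bytes=1024):
    """Build a synthetic XML document of roughly target_bytes (hypothetical shape)."""
    parts = ["<doc>"]
    size = len("<doc>") + len("</doc>")
    i = 0
    while size < target_bytes:
        item = f'<item id="{i}">value {i}</item>'
        parts.append(item)
        size += len(item)
        i += 1
    parts.append("</doc>")
    return "".join(parts).encode("utf-8")


def count_events(doc):
    """Parse with Expat only, counting start-element events and building nothing."""
    counts = {"start": 0}

    def on_start(name, attrs):
        counts["start"] += 1

    # An Expat parser cannot be reused after its final Parse call,
    # so a fresh one is created per document.
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(doc, True)
    return counts["start"]


def build_dom(doc):
    """Build a full DOM tree, count the <item> elements, then release the tree."""
    dom = xml.dom.minidom.parseString(doc)
    n = len(dom.getElementsByTagName("item"))
    dom.unlink()
    return n


def docs_per_second(fn, doc, repeat=2000):
    """Time repeat calls of fn(doc) and return documents processed per second."""
    start = time.perf_counter()
    for _ in range(repeat):
        fn(doc)
    return repeat / (time.perf_counter() - start)


if __name__ == "__main__":
    doc = make_doc()
    print(f"document size: {len(doc)} bytes")
    print(f"expat events : {docs_per_second(count_events, doc):.0f} docs/s")
    print(f"minidom tree : {docs_per_second(build_dom, doc):.0f} docs/s")
```

The gap between the two figures gives a feel for how much of the cost sits on
the application side of the parser. It says nothing about the lower bound
imposed by the format itself, which would need a comparison against a
different serialization.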








