Use cases for parsing efficiency (was Re: Parsing efficiency? -
On Wed, 26 Feb 2003 07:44:31 -0500, David Megginson <david@m...> wrote:

>
> In the past, I've observed that actual XML parsing generally accounts
> for under 1% of a batch application's running time (much less, if
> you're building a big object tree or doing any database access). That
> means that if you speed up the XML parsing by 10%, you might have sped
> up your application by less than 0.1% (or realistically, not at all,
> if the parser was already idling waiting for data over the network).

I for one wouldn't dispute that most people on this list have had similar experiences, and we all know that you won't get much real speedup by optimizing non-bottlenecks. As a matter of fact, until a few months ago I was as much a scoffer at the arguments that Al and Robin raise as any of you.

My day job colleagues changed my mind by pointing out that in industrial-strength, native XML processing environments, nothing much is happening besides XML being parsed, processed (stored, queried, transformed) and serialized again. The better the code gets and the more efficient customers get in using the code (e.g. building DB indexes and optimizing queries, in our case), the more that parse/serialization step becomes a bottleneck.

I've heard the same thing from industrial-strength SOAP developers -- as the volume of messages goes up and processing resources get dedicated to XML (i.e., no application logic or DB access happening on the machine parsing, processing, serializing the XML), the bottlenecks in XML parsing become increasingly apparent. Sure, Father Moore will ultimately solve this problem with faster hardware, but that's not a great marketing pitch for software people.

So why should you all care about standardization of processing pipelines that are generally *internal* to products? I'm not completely sure you should.

One might argue that you, as customers of / developers for enterprise-class XML processing software, may wish to tap into the pipelines at a lower level, e.g. grab the rawest Infoset data out of a DBMS before it gets sanitized and standardized by the API level, or insert your own specialized SOAP processors (e.g. to support a new choreography standard) deep into IBM's or Microsoft's architecture. If the vendors all go their separate ways on efficient infoset representations, we're back to the Bad Old Days (e.g., where SQL is today) in which "standards" are more or less conceptual frameworks rather than the basis for interoperable code, at least at the down-n-dirty level.

Another argument for standardization of this stuff is that -- as Robin points out repeatedly -- lots and lots of wheels are being reinvented daily. There's something to be said for cooperation and joint research / development / testing under the aegis of a standards body (perhaps like XQuery, which is also more of a joint research project than a standardization of existing practice).

So, I'm not at all sure that standardization of efficient infoset serializations is something that the W3C or anyone else should undertake at this time. But I don't want to see the W3C preclude it (or XML geeks conclude that it is evil) either.

XML processing is moving more and more into the core of real enterprises. We'll see the previous situation, where XML is just a transient serialization format between DBs and applications, turned around, so that most of the components of a processing pipeline are taking XML in, storing/processing it natively, and putting XML out. In that scenario, lots of people are going to be looking for ways to reduce the parsing bottlenecks ... either by subsetting (entity expansion is a notorious bottleneck in high-performance XML processors, to the point where the SOAP community simply refuses to do it), by exploring "binary" serializations, or both.
I don't want to see this "pollute" document XML, but some of the assumptions of what is universal across document and data XML will probably have to change to make this happen without a major fork.
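
For what it's worth, the arithmetic behind both positions can be sketched in a few lines. This is only a back-of-the-envelope illustration: the 1% share and the 10% parser improvement come from the quoted message, while the 50% share for a dedicated XML pipeline is a made-up figure chosen for contrast, not a measurement, and the function name is hypothetical.

```python
def overall_time_saved(parse_fraction, parse_time_reduction):
    """Fraction of total running time saved when parsing accounts for
    `parse_fraction` of the run and parsing time is cut by
    `parse_time_reduction` (both given as fractions between 0 and 1)."""
    return parse_fraction * parse_time_reduction


# Batch application from the quoted message: parsing is about 1% of the run.
batch = overall_time_saved(0.01, 0.10)

# Hypothetical dedicated XML pipeline: parsing/serialization ASSUMED to be
# half of the run (illustrative figure only).
pipeline = overall_time_saved(0.50, 0.10)

print(f"batch application: {batch:.2%} of total time saved")
print(f"dedicated pipeline: {pipeline:.2%} of total time saved")
```

The same 10% parser improvement is negligible when parsing is 1% of the run, and starts to matter once parsing and serialization are most of what the machine does.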