Re: Use cases for parsing efficiency (was Re: Parsing
Mike Champion wrote:

> My day job colleagues changed my mind by pointing out that in
> industrial-strength, native XML processing environments, nothing much
> is happening besides XML being parsed, processed (stored, queried,
> transformed) and serialized again.

That's quite a lot happening (other than parsing). I mean, what else /could/ happen?

> The better code gets and the more
> efficient customers get in using the code (e.g. building DB indexes and
> optimizing queries, in our case), the more and more that
> parse/serialization step becomes a bottleneck. I've heard the same
> thing from industrial-strength SOAP developers -- as the volume of
> messages goes up and processing resources get dedicated to XML (i.e., no
> application logic or DB access happening on the machine parsing,
> processing, serializing the XML), then the bottlenecks in XML parsing
> become increasingly apparent. Sure, Father Moore will ultimately solve
> this problem with faster hardware, but that's not a great marketing
> pitch for software people.

So, following David, in one hundred seconds you spend one second parsing XML and 99 seconds doing something else. Suppose you get a tenfold speedup doing something else (cigars all round). You're down to about 11 seconds. Parsing is approaching 10%. A tenfold speedup in parsing now saves you only about 0.9 of a second, a bit over 8% of the total. And because it's /still/ the wrong side of the 80/20 split, it's /still/ not the place to be looking, unless you know that processing time is evenly distributed through the code base (but that would be rare, and probably worth writing a paper on). The same reasoning applies if parsing takes 10% of the time to begin with.

> So, I'm not at all sure that standardization of efficient infoset
> serializations is something that the W3C or anyone else should undertake
> at this time. But I don't want to see the W3C preclude it (or XML geeks
> to conclude that it is evil) either. XML processing is moving more and
> more into the core of real enterprises. We'll see the previous situation
> where XML is just a transient serialization format between DBs and
> applications turned around, so that most of the components of a
> processing pipeline are taking XML in, storing/processing it natively,
> and putting XML out. In that scenario, lots of people are going to be
> looking for ways to reduce the parsing bottlenecks ...

Performance arguments mean nothing without measurement. And even if parsing is a problem, it does not follow that XML requires subsetting. For example, you might be better off with an 'enterprise class parser', or an 'enterprise class datamodel', than with an 'industrial class subset'. I'd like to see some numbers on parsing concerns, so we could figure out a) is there a problem, b) where is the problem, c) what's the solution.

Bill de hÓra
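
The back-of-envelope figures in the reply (100 seconds in total, 1 of them spent parsing) can be checked with a few lines of Python. This is only a sketch of that arithmetic, not a measurement: the function names `total_time` and `parse_share` are made up for illustration, and the 1 s / 99 s split is the hypothetical one used in the message.

```python
def total_time(parse_s, other_s, parse_speedup=1.0, other_speedup=1.0):
    """Total runtime in seconds after speeding up each phase by a factor."""
    return parse_s / parse_speedup + other_s / other_speedup


def parse_share(parse_s, other_s, parse_speedup=1.0, other_speedup=1.0):
    """Fraction of the total runtime that is spent parsing."""
    total = total_time(parse_s, other_s, parse_speedup, other_speedup)
    return (parse_s / parse_speedup) / total


# Hypothetical split from the message: 1 s parsing, 99 s everything else.
PARSE_S, OTHER_S = 1.0, 99.0

# Step 1: make everything except parsing ten times faster.
after_other = total_time(PARSE_S, OTHER_S, other_speedup=10)
share_after_other = parse_share(PARSE_S, OTHER_S, other_speedup=10)

# Step 2: additionally make parsing ten times faster.
after_both = total_time(PARSE_S, OTHER_S, parse_speedup=10, other_speedup=10)
saved_by_parsing = after_other - after_both

print(f"total after speeding up the rest: {after_other:.1f} s")
print(f"parsing share at that point:      {share_after_other:.1%}")
print(f"extra time saved by faster parse: {saved_by_parsing:.1f} s "
      f"({saved_by_parsing / after_other:.1%} of the total)")
```

Changing `PARSE_S` and `OTHER_S` to a measured split is the point of the exercise: the conclusion depends entirely on where the time actually goes.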