Re: Visiting your cake and eating it too (was Stupid Question
At 20:16 06/03/2002 +1100, Rick Jelliffe wrote:
>From: "Sean McGrath" <sean.mcgrath@p...>
>
> > At 18:32 06/03/2002 +1100, Rick Jelliffe wrote:
> >
> > >There is a case to be made that, for implementability reasons, it is actually
> > >good to bundle together as many orthogonal functions that can act as
> > >visitors on the same traversal of the infoset.
> >
> > This makes no sense to me. I must be missing something. What do you
> > mean "for implementability reasons". My gut tells me quite the
> > opposite!
>
>Oops, I meant "for efficiency reasons". Rather than traversing an infoset
>many times, it can be more efficient if different functions can be performed
>in a single pass through a document.

Rick, I agree with you most of the time but I have to totally disagree with this line of thinking. (This started out as a short reply and sort of grew and grew. Sorry.)

<Rant>

Efficiency of XML processing is classic premature optimization territory. I've learned from bitter experience that doing XML processing monolithically for efficiency reasons is almost always a bad idea. It's bad design, leads to poor evolvability and - more often than not - is based on a false impression of where the bottlenecks really are.

I find again and again that if you design and implement loosely coupled XML systems - ignoring efficiency concerns - efficiency has a way of sorting itself out without adversely impacting the design.

Two examples germane to this discussion. In XPipe, we are prototyping some XPipe compilers that are looking very promising: high-efficiency execution but loose coupling of processing types. Also in XPipe, we are working on some P2P execution environments (XGrids) that allow multiple processors to cooperate to perform XML processing. This exhibits efficiencies that will bring tears to your eyes - yet not at the expense of monolithic designs.

XGrid shows up one significant foible in software developers - a predisposition to thinking in von Neumann architecture terms.
I.e., if it takes 1 minute to process 1 XML document and I have 100,000 documents, then the processing will take 100,000 minutes = about 70 days.

Now, in many cases, the processing is trivially parallelizable. In the truly trivial case where there are no interdependencies between the XML instances, the total processing time can get as close to 1 minute as you like with the aid of multiple processors. With the aid of some judicious domain decomposition, we find that *most* XML processing can be made trivially parallelizable.

Optimizing the 1 minute figure for processing a single XML instance - the end-to-end "make time" of a single XML document - only makes sense in near-realtime scenarios. For everything else, XGrid-style distributed computing will beat hand-crafted "optimized" single-pass systems hands down, both in throughput terms and in evolvability terms.

Any schema language, query language or any other XML technology that justifies the complexity that is concomitant with monolithic design on efficiency grounds should be treated skeptically.

Remember this - complexity is a well-established business model. Justifying complexity on the grounds of efficiency plays on the collective weakness of us practitioners in the software engineering profession: our failure to spot the baselessness of most "for efficiency" pitches.

</Rant>

Sean
xpipe.sourceforge.net
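
To make the arithmetic in the message concrete, here is a minimal Python sketch of the "truly trivial" case, where no XML instance depends on any other. It is an illustration only and not XGrid or XPipe code: the function names (ideal_wall_clock_minutes, count_elements, process_all) and the element-counting job are hypothetical, and a local thread pool merely stands in for the multiple processors the message talks about.

```python
import math
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def ideal_wall_clock_minutes(n_docs, minutes_per_doc, n_workers):
    """Best-case elapsed time when every document is independent."""
    # Each worker handles at most ceil(n_docs / n_workers) documents in turn.
    return math.ceil(n_docs / n_workers) * minutes_per_doc


def count_elements(xml_text):
    """Hypothetical per-document job: count the elements in one instance."""
    return sum(1 for _ in ET.fromstring(xml_text).iter())


def process_all(documents, n_workers=4):
    """Run the per-document job on every document, with no shared state."""
    # Assumption: threads keep the sketch short. CPU-bound work in CPython
    # would need processes or separate machines to gain real parallelism.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(count_elements, documents))


# One worker: 100,000 minutes, i.e. roughly 69.4 days.
serial_days = ideal_wall_clock_minutes(100_000, 1, 1) / (60 * 24)

# One worker per document: the total approaches the 1 minute "make time".
fully_parallel = ideal_wall_clock_minutes(100_000, 1, 100_000)
```

Because the per-document job takes one document and touches nothing else, the same function can be handed to more workers without changing it, which is the loose coupling the message argues for.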