Re: Streaming XML (WAS: More on taming SAX (was Re:
David,

Thanks for your answer.

> Unfortunately, the ones who do not call in the
> consultants simply conclude that XML is too slow and abandon it
> completely.

I find this REALLY, REALLY unfortunate. Here we are, back in the early days of SQL, when people didn't believe they could get decent performance unless they hard-coded their file and index management by hand, or in the early days of Java, when you supposedly couldn't get decent performance unless you hard-coded your memory management by hand. That's sad.

How is it that some people cannot trust that XML/XSLT/XQuery performance WILL come? Performance always comes when there is a need for it.

Moreover, even if the performance were NOT comparable (which I doubt anyway), the difference in productivity is SO big. Do you know how many servers can be bought for the cost of the work of those XML consultants? And that is without mentioning that those hard-coded solutions will break with every single change (e.g. think of schema evolution: what was streamable yesterday is not streamable today, so here you go, call back your XML consultants and rewrite your SAX application).

> So far, no one has shown me a DOM, XSLT, or XQuery-based
> app that is not at least an order of magnitude or two slower than a
> hand-rolled streaming application, and that's not even considering the
> memory overhead.

This I don't believe so easily. I worked for two companies recently (BEA and Oracle), and they both have very, very decent XQuery implementations. At BEA we put all our efforts into maximizing streaming, exactly to solve the use cases you are talking about: thousands of transformations per second per server. As long as I was there, I didn't hear too many complaints about the performance of the XQuery engine. So I have a hard time believing that XQuery's performance is the big problem, or that performance will remain a big problem for a long time.

> because (as you suggest) SAX and STAX are low-level APIs. Coming up
> with commonly-accepted streaming subsets of XPath or XQuery might give
> the best of both worlds: fast prototyping, as with XSLT or XQuery,
> *and* decent performance in a real, production-grade system.

Here I am lost again. Why do we need subsetting? There is a sort of myth that unless you consider a *subset* of XQuery/XSLT, there cannot be good performance or streaming. This myth is strange. Of course, not all of XQuery can be executed with zero memory consumption. But please explain: why is this an issue?

Just use full XQuery, and leave the task of minimizing memory consumption to the XQuery implementors. If they can execute your queries with no memory consumption, they'll do it; otherwise, they'll just use the minimum amount of memory they need for the given computation. If they need to automatically rewrite your query into an equivalent one that enables more streaming, they'll do that too (e.g. rewriting a backwards navigation into a forward one). There are hundreds of possible optimizations to enable streaming and increase performance.

Why do you think we need XQuery/XSLT subsetting?

Best regards,
Dana