XML-DEV Mailing List Archive

RE: Streaming XML (WAS: More on taming SAX (was Re: [xm
As someone who was until very recently "one of those implementers" I completely disagree with you. We had customers who wanted to process XML documents that are hundreds of megabytes to gigabytes in size and who can't afford to materialize even a fraction of these documents in certain cases. Then there were customers who wanted to process thousands of XML documents per minute and couldn't afford the overhead of object creation, memory consumption, and GC. Using XQuery or XSLT in such scenarios, even with various optimization tricks, just wouldn't cut it.

Every paper I've seen on streaming XML assumes some forward-only processing OR is just wrong. Instead of telling folks to use Google Scholar or CiteSeer to find relevant works, are there any techniques in any papers in particular you want to highlight?

--
PITHY WORDS OF WISDOM
Eat right, Exercise, Die anyway.

________________________________
From: Daniela Florescu [mailto:dflorescu@m...]
Sent: Mon 12/27/2004 12:00 PM
To: XML Developers List
Subject: Re: Streaming XML (WAS: More on taming SAX (was Re: ANN: Amara XML Toolkit 0.9.0))

> I've thought about using an XPath tracker in error reporting to
> my library, which would be very simple to add at this point, and
> it's necessary, I think because the document locator loses
> meaning when I chain together a bunch of SAX filters.
..........
>
> In any case, I'm reading through some of the other articles
> you've been posting. This is a very interesting discussion.

I read with great interest the whole discussion about XML streaming and SAX, and I have to admit that I am very confused by it.

Could you guys please try to clarify for me the answer to the following question: instead of hand coding streaming applications using SAX, couldn't you write some XQuery code (with external functions probably) to do the same thing? Did you try at least? Did you try and fail? If yes, why did it fail?
My hope is that at a certain point people will stop writing low level code, and they'll rely on good implementations of XQuery to do the right amount of streaming, in the optimal way. That should be the vendor's problem, not the user's problem.

Another question: why do you people care about "perfect" streaming, i.e. streaming with zero memory consumption? Between perfect streaming and total materialization there is a world of possibilities, where materialization happens, but only restricted to the minimum amount of data required to compute the answer, and only for the minimum amount of time necessary to compute the correct answer. Perfect streaming happens too rarely to be of any interest. What is interesting is all this world in between.

Anyway, I believe that people shouldn't try to hand code their applications using low level APIs like SAX or StAX, but use a higher level language like XSLT or XQuery, and trust the XQuery/XSLT implementors to do a good job of minimizing memory consumption. That's *their* job, not *yours* as users.

But anyway, for those interested in streaming processing of XML, the database research might come in handy. There have been several studies of the problem in the literature. For example you could find some of it at http://citeseer.ist.psu.edu/ by searching for "streaming XML"; starting from there you might find some interesting papers.

Best regards, happy holidays,
Dana

-----------------------------------------------------------------
The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
initiative of OASIS <http://www.oasis-open.org>

The list archives are at http://lists.xml.org/archives/xml-dev/

To subscribe or unsubscribe from this list use the subscription
manager: <http://www.oasis-open.org/mlmanage/index.php>
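
Editor's illustration: the "world in between" perfect streaming and total materialization that the message above describes can be sketched with Python's standard xml.etree.ElementTree.iterparse. This is only a minimal sketch, not something from the thread or from any XQuery engine. The element names (order, amount) and the function name sum_amounts are hypothetical, and it assumes the records are direct children of the root element. Each record subtree is materialized just long enough to read one value and is then discarded.

```python
import io
import xml.etree.ElementTree as ET


def sum_amounts(source, record_tag="order", field_tag="amount"):
    """Return (count, total) of <amount> values found in <order> records.

    Hypothetical element names; assumes records are direct children
    of the root element.
    """
    total = 0.0
    count = 0
    # iterparse hands back elements as they are parsed instead of
    # returning only after the whole tree has been built.
    context = iter(ET.iterparse(source, events=("start", "end")))
    _, root = next(context)  # the first event is the start of the root
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            # The record subtree is fully materialized at this point.
            text = elem.findtext(field_tag)
            if text is not None:
                total += float(text)
                count += 1
            # Discard the finished record so memory use stays bounded
            # by one record rather than by the whole document.
            root.clear()
    return count, total


doc = (b"<orders>"
       b"<order><amount>1.5</amount></order>"
       b"<order><amount>2.5</amount></order>"
       b"<order/>"
       b"</orders>")
count, total = sum_amounts(io.BytesIO(doc))
```

The same loop works on an open file or a path in place of the in-memory buffer. Whether a given XQuery or XSLT implementation does something comparable internally is exactly the open question in this thread.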