Release of the GCX XQuery Engine
Stefanie Scherzinger scherzinger at infosys.uni-sb.de
Sun Feb 4 23:15:30 PST 2007
Hi Frans,

thanks for your interest in our XQuery engine, and thanks for your feedback.

> What made you think that comparing your implementation to the others was
> reasonable?

Actually, it is not so easy to get hold of reference implementations, and we had to make do with what is publicly available. GCX has two main characteristics: it is an in-memory XQuery engine, and it is geared towards streaming XQuery evaluation.

The FluXQuery engine is the most natural choice for a reference, because it is also a main-memory XQuery engine geared towards XML stream processing, and it implements a very similar XQuery fragment. There are other streaming research prototypes (e.g. XSM), but they typically have not been released by their makers yet.

The other in-memory engines (QizX, Galax, and Saxon) implement more XQuery features (or all of them), but they are not geared towards stream processing. At least the principal architecture is comparable.

Finally, we chose MonetDB purely out of interest in how we would perform in comparison. As ours is a streaming engine, comparing it against a secondary-storage implementation that can make use of index structures etc. in a different way is unfair to us.

Unfortunately, no other streaming XQuery implementations are available to compare against. However, if you know of any suitable implementations, I'd appreciate it very much if you could point us to them.

> I'm asking because I see it as comparing apples against oranges. Some of the
> other products run the queries as is and they implement XQuery, which is
> quite different from not implementing XQuery and rewriting queries to one's
> liking.

I hope there is no misunderstanding: of course, we first rewrote the queries as shown on the website, and then ran the same queries on the same data on each engine.

> Measuring memory usage with top is as far as I know generally advised
> against. See:
>
> http://ktown.kde.org/~seli/memory/analysis.html

Thanks for the link. However, when GCX needs only a little more than 1 MB of main memory for the same query where others require over a hundred MB, then I think a point has been made.

> So the short story to why you get such a low memory footprint is that you
> don't load more of the document than is needed, as told by static
> analysis ("roles")?

There are two key approaches. First, document projection, where we try to load only what may be needed for query evaluation. This, of course, has been done before (e.g. the Galax people experimented with it, too). Second, and this is new, the garbage collector removes the loaded data once it is not needed anymore. This is done continually, and for many queries it works out very nicely, such that only a small subset of the input is kept in main memory at any moment during query evaluation.

If you are interested in the internals, maybe you'll want to check out the paper on GCX:

http://www.infosys.uni-sb.de/publications/INFOSYS-TR-2006-13.pdf

Ciao,
Steffi
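As a rough illustration of the second idea in the message above (discarding loaded data as soon as it is no longer needed), here is a minimal Python sketch. It is not GCX's algorithm and uses no GCX code: the function name `stream_text`, the fixed path argument, and the sample document are assumptions made for this example, and the standard-library `iterparse` still builds every element briefly before it is dropped, so projection is only approximated.

```python
import io
import xml.etree.ElementTree as ET


def stream_text(source, wanted_path):
    """Yield the text of each element whose root-to-node tag path equals
    wanted_path, removing every element from the tree once it is closed.

    Hypothetical helper for illustration only; not part of GCX.
    """
    wanted = tuple(wanted_path)
    open_elems = []  # currently open elements, root first
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            open_elems.append(elem)
            continue
        # "end" event: elem is complete and is the last entry in open_elems.
        if tuple(e.tag for e in open_elems) == wanted:
            yield elem.text or ""
        open_elems.pop()
        if open_elems:
            # The element is no longer needed: detach it from its parent
            # so that its memory can be reclaimed.
            open_elems[-1].remove(elem)


if __name__ == "__main__":
    doc = (b"<bib><book><title>A</title><price>1</price></book>"
           b"<book><title>B</title></book></bib>")
    print(list(stream_text(io.BytesIO(doc), ("bib", "book", "title"))))
```

Because each closed element is detached from its parent right away, only the chain of currently open elements stays reachable at any time, which mirrors in a very simplified way the "small subset kept in main memory" behaviour described in the message.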






