
TLC XQuery timings and XMark size factors

Wolfgang Hoschek wolfgang.hoschek at mac.com
Wed Jan 17 11:09:41 PST 2007


On Jan 17, 2007, at 10:26 AM, James A. Robinson wrote:

>
> [This may be considered off topic, my apologies if it is. It's related
> to XQuery by way of a paper and the XMark test, but since it's not a
> "How do you do Y in XQuery" I'm unsure.]
>
> Hi folks,
>
> I came across an interesting looking paper last night,
>
>   "Tree Logical Classes for Efficient Evaluation of XQuery"
>   http://www.eecs.umich.edu/db/timber/files/tlc.pdf
>   (165.43 KB)
>
> The math is over my head, but I was curious about the results they
> write about regarding their algorithm when applied to XMark data sets.
> I'd not looked at XMark until now, though I've read about it on some
> blogs (Dr. Kay sometimes writes about his tests of Saxon against XMark
> data).
>
> Downloading the xmlgen program from http://monetdb.cwi.nl/xml/, I'm a
> bit confused about the numbers listed in the paper, and I was wondering
> if someone who has used xmlgen could explain something to me: The
> authors say they tested 'size factors from 0.1 (approx. 67MB combined
> data plus indexes space) up to factor 5 (3.5GB combined data plus
> indexes space)', and I'm wondering if anyone who has read (or cares to
> read) that paper can tell me if they understand how those sizes were
> reached?
>
> The sizes I'm seeing from xmlgen don't seem to map to the same sizes
> the authors list.  A size factor of 0.1 comes out to just under 12MB of
> data.
> Looking at http://monetdb.cwi.nl/xml/faq.txt, I was simply running
>
>   xmlgen -f 0.1 -o xmark-0.1.xml
>
> Adding pretty formatting only adds another couple of megabytes to the
> size. I'm curious to try and generate similar sets of data to see if I
> can run tests against a couple of platforms available to me, but this
> first examination makes me wonder if there is something missing from
> the equation which I don't know about.
>

12 MB sounds about right. Here are the file sizes and scale factors
I'm getting:

-rw-r--r--    1 hoschek  hoschek    1161615 Nov 17  2005 auction-0.01.xml
-rw-r--r--    1 hoschek  hoschek   11669705 Nov 17  2005 auction-0.1.xml
-rw-r--r--    1 hoschek  hoschek   58005732 Nov 17  2005 auction-0.5.xml
-rw-r--r--    1 hoschek  hoschek  116517075 Nov 22  2005 auction-1.0.xml

"approx. 67MB combined data plus indexes space" might indicate the
storage consumed when the data is stored in some kind of indexed XML
database with an alternative storage format, rather than in a plain
XML file.
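
As a rough check of the arithmetic, the sketch below takes the byte
counts from the listing above, divides each by its scale factor, and
compares the paper's quoted figures against the raw XML size. It rests
on a few assumptions that are not from the paper: decimal megabytes,
"3.5GB" read as 3500 MB, and raw size scaling linearly up to factor 5
(the listing only goes up to 1.0). The function names are made up for
illustration.

```python
# File sizes in bytes, copied from the directory listing above,
# keyed by the xmlgen scale factor used to generate each file.
SIZES = {
    0.01: 1161615,
    0.1: 11669705,
    0.5: 58005732,
    1.0: 116517075,
}

MB = 1_000_000  # decimal megabytes (assumption; the paper may mean 2**20)


def bytes_per_unit_factor(sizes):
    """Return size / factor for every measured scale factor."""
    return {factor: size / factor for factor, size in sizes.items()}


def overhead_ratio(reported_mb, factor, sizes):
    """Reported storage divided by the raw XML size at `factor`.

    The raw size is extrapolated linearly from the factor 1.0 file,
    which is an assumption for factors outside the listing.
    """
    raw_mb = sizes[1.0] * factor / MB
    return reported_mb / raw_mb


if __name__ == "__main__":
    for factor, per_unit in bytes_per_unit_factor(SIZES).items():
        print(f"factor {factor}: {per_unit / MB:.1f} MB per unit factor")
    # Figures quoted from the paper: 67 MB at 0.1, 3.5 GB at 5.
    print(f"factor 0.1: reported/raw = {overhead_ratio(67, 0.1, SIZES):.1f}")
    print(f"factor 5:   reported/raw = {overhead_ratio(3500, 5, SIZES):.1f}")
```

The per-unit sizes all land near 116 MB, so the plain XML output looks
close to linear in the scale factor, and both of the paper's figures
come out at a little under or around six times the raw size. That is
at least consistent with the numbers describing stored data plus
indexes rather than the XML file itself, though it does not prove it.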

Wolfgang.

