|
[XQuery Talk Mailing List Archive Home] [By Date] [By Thread] [By Subject] [By Author] [Recent Entries] [Reply To This Message] get highest numberWolfgang wolfgang at exist-db.orgSat May 17 19:52:43 PDT 2008
>> Performance in a database is very dependent on how many pages you read from >> disk. If there's an index that contains the node identifiers for all the >> attributes named id, and that index occupies one page, then you can evaluate >> //@id with one page access. But to evaluate max(//@id) you need to >> dereference those node identifiers, which may mean accessing one page for >> each separate id attribute. > > So the index just contains the node identifiers - the actual values > for those nodes still need to be fetched from disk? In our implementation, most path expressions operate on references, not on the actual nodes. The references are just pairs of documentId/nodeId and do not occupy much space in an index. As Michael pointed out, the entire sequence of @id references can be stored in a few disk pages. Also, all possible relationships between nodes can be computed by comparing node ids. This means that expressions like /root//@id are evaluated without loading the actual nodes. All subexpressions just return references to nodes, not the nodes themselves. To serialize or atomize a node, the node references need to be dereferenced, which means loading the node from the sequence of pages in which the document is stored. This may indeed result in one page access operation for every attribute. > If so, if you indexed //@id/text() and then did max(//@id/text()) > would that be a solution? Creating a "value"-index on //@id/text() would at least have the advantage that all required atomized values could be stored in subsequent pages, thus causing less IO. However, in many cases the expression within max() could be rather complex, making it hard to decide on what exactly the index should be created. I was recently thinking about a solution which just pre-fetches the atomized values on demand, thus speeding up the second access, though not the first one. Anyway, those are implementation-specific questions, which may not belong on this list, at least not in detail. Wolfgang
|
Purchase Stylus Studio Online Today!Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced! Download The World's Best XML IDE!Accelerate XML development with our award-winning XML IDE - Download a free trial today! Subscribe in XML format
|






