[XSL-LIST Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message] Re: Function for determining one XPath as subset of a
On Tue, 2016-01-26 at 16:15 +0000, Adam Retter adam.retter@xxxxxxxxxxxxxx wrote: > Given two simple XPaths, say: > > 1. //w > > 2. /x/y/z/w[@a = 'v'] >B [...] > I wondered if there might be an algorithm or library that someone > already had or has written which might be able to give me the answer? There's quite a bit of literature on optimizing XPath in the context of databases (e.g. in proc. VLDB, in "XML-Based Data Management and Multimedia Engineering" and elsewhere) but I don't know of a single coherent summary. This may be at least in part because what's useful and practical depends on the implementation and the situation. For example, a system with an element occurrence index might solve //w in O(n_w) time, where n_w is the number of occurrences of w elements in the corpus being searched; one using a tree structure might do better with the explicit path. A path like //a//b//c//d is much harder using tree navigation, but rewritten as d[ancestor::c[ancestor::b[ancestor::a]]] against an element index will likely run in time close to O(n_d log D) where D is the average depth of the tree. A system smart enough to notice that there are only three "c" elements in the corpus could do better than O(n_d) though (my text retrieval system did/does that to speed up multi-word phrase searches, for example). We encountered each of these implementation strategies during the development of XQuery. B The cost of building an element index for a non-database-backed XPath implementation is in some systems amortized by doing it during parsing, while the input XML is being read; systems that run XPath against a stored or in-memory tree might not have that luxury, or might have the even greater luxury of a precomputed index. I don't know of anything off the shelf, and although open source XQuery implementations might be a place to look, it's likely that optimizations are done on an internal representation of the XPath expression. > > I realise that I can only probably cover a subset of XPath itself, > but > it is only the path steps with predicates which I am interested in. > > Ideally I am looking for something in Java.
|
PURCHASE STYLUS STUDIO ONLINE TODAY!Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced! Download The World's Best XML IDE!Accelerate XML development with our award-winning XML IDE - Download a free trial today! Subscribe in XML format
|