Re: Function for determining one XPath as subset of a

Play the video

Subject: Re: Function for determining one XPath as subset of another
From: "Liam R. E. Quin liam@xxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 26 Jan 2016 21:35:25 -0000

On Tue, 2016-01-26 at 16:15 +0000, Adam Retter
adam.retter@xxxxxxxxxxxxxx wrote:
> Given two simple XPaths, say:
> 
> 1. //w
> 
> 2. /x/y/z/w[@a = 'v']
>B [...]
> I wondered if there might be an algorithm or library that someone
> already had or has written which might be able to give me the answer?

There's quite a bit of literature on optimizing XPath in the context of
databases (e.g. in proc. VLDB, in "XML-Based Data Management and
Multimedia Engineering" and elsewhere) but I don't know of a single
coherent summary.

This may be at least in part because what's useful and practical
depends on the implementation and the situation.

For example, a system with an element occurrence index might solve //w
in O(n_w) time, where n_w is the number of occurrences of w elements in
the corpus being searched; one using a tree structure might do better
with the explicit path.

A path like //a//b//c//d is much harder using tree navigation, but
rewritten as d[ancestor::c[ancestor::b[ancestor::a]]] against an
element index will likely run in time close to O(n_d log D) where D is
the average depth of the tree. A system smart enough to notice that
there are only three "c" elements in the corpus could do better than
O(n_d) though (my text retrieval system did/does that to speed up
multi-word phrase searches, for example).

We encountered each of these implementation strategies during the
development of XQuery. B The cost of building an element index for a
non-database-backed XPath implementation is in some systems amortized
by doing it during parsing, while the input XML is being read; systems
that run XPath against a stored or in-memory tree might not have that
luxury, or might have the even greater luxury of a precomputed index.

I don't know of anything off the shelf, and although open source XQuery
implementations might be a place to look, it's likely that
optimizations are done on an internal representation of the XPath
expression.

> 
> I realise that I can only probably cover a subset of XPath itself,
> but
> it is only the path steps with predicates which I am interested in.
> 
> Ideally I am looking for something in Java.

Current Thread

Re: Function for determining one XPath as subset of another, (continued)
- Dimitre Novatchev dnovatchev@xxxxxxxxx - 26 Jan 2016 17:19:13 -0000
- Michael Kay mike@xxxxxxxxxxxx - 26 Jan 2016 19:24:07 -0000
  - Wolfgang Laun wolfgang.laun@xxxxxxxxx - 26 Jan 2016 19:29:31 -0000
  - Christopher R. Maden crism@xxxxxxxxx - 26 Jan 2016 20:37:48 -0000
- Liam R. E. Quin liam@xxxxxx - 26 Jan 2016 21:35:26 -0000 <=

<- Previous	Index	Next ->
Re: Function for determining , Christopher R. Maden	Thread
Re: Function for determining , Christopher R. Maden	Date
	Month

PURCHASE STYLUS STUDIO ONLINE TODAY!

Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced!

Download The World's Best XML IDE!

Accelerate XML development with our award-winning XML IDE - Download a free trial today!

Subscribe in XML format

RSS 2.0
Atom 0.3

XML Editor - Download a 15 Day Free Trial Now >

See What's New in Stylus Studio >

Buy Stylus Studio - XML Editor - Now >