RE: Nasty XPath expressions
> > I wonder what XPath expressions would cause an "XPath processor" to sweat?
> > Any "XPath processor" developer out there?
>
> Howdy, I'm Bob, one of the developers of Jaxen (http://jaxen.org/), a
> 'universal' XPath engine for Java.
>
> The biggest issue we've had (it's Today's Big Issue) is regarding
> document-ordering when using a union expression.
>
>     $foo/bar | $cheese/melty
>
> XPath says that nodes should be returned in 'document order', which
> becomes non-trivial in the case of some expressions involving
> the union operator.

That depends on your data structures, of course. Saxon's native tree structures (both of them) are optimised for this operation. The standard tree structure stores a serial number in each node; the "tinytree" (which is now the default) stores nodes in an array, in document order. These structures both exploit the fact that the tree is immutable, and both make sorting into document order trivial.

Saxon also has a driver allowing access to JDOM trees. With this data structure, sorting into document order is indeed painful, though if you optimize for common cases, such as all the nodes being siblings, it's not too bad.

Saxon of course goes to great lengths to avoid the need for a sort when it knows the nodes are already in document order, as they will often be, for example with path expressions such as chapter/section/@title; and a union is done as a merge operation on sorted operands.

And a point of detail, which I think Evan Lenz already commented on: XPath 1.0 doesn't say the nodes must be sorted in document order, and there are many cases where sorting is unnecessary. For example, many operations only require selection of the node that is first in document order, which can be found without doing a sort.

Mike Kay
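
To make the merge and "first node" points above concrete, here is a minimal sketch in Java. It assumes an immutable tree in which every node carries a serial number assigned in document order, and that both union operands are already sorted and free of duplicates. The names OrderedNode, DocumentOrder, union and first are hypothetical, chosen for illustration; this is not Saxon's or Jaxen's actual API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical immutable node that records its position in document order. */
final class OrderedNode {
    final int serial;   // assumption: assigned once, in document order, when the tree is built
    final String name;

    OrderedNode(int serial, String name) {
        this.serial = serial;
        this.name = name;
    }
}

final class DocumentOrder {
    private DocumentOrder() {}

    /**
     * Union of two node lists as a merge. Both inputs must already be in
     * document order and contain no duplicates; the result then has the
     * same properties, and no sort is performed.
     */
    static List<OrderedNode> union(List<OrderedNode> left, List<OrderedNode> right) {
        List<OrderedNode> result = new ArrayList<>(left.size() + right.size());
        int i = 0;
        int j = 0;
        while (i < left.size() && j < right.size()) {
            OrderedNode a = left.get(i);
            OrderedNode b = right.get(j);
            if (a.serial < b.serial) {
                result.add(a);
                i++;
            } else if (a.serial > b.serial) {
                result.add(b);
                j++;
            } else {
                // Same node appears in both operands: emit it once.
                result.add(a);
                i++;
                j++;
            }
        }
        // Copy whatever remains of the operand that was not exhausted.
        while (i < left.size()) {
            result.add(left.get(i++));
        }
        while (j < right.size()) {
            result.add(right.get(j++));
        }
        return result;
    }

    /** First node in document order from an unsorted collection: one scan, no sort. */
    static OrderedNode first(Iterable<OrderedNode> nodes) {
        OrderedNode best = null;
        for (OrderedNode n : nodes) {
            if (best == null || n.serial < best.serial) {
                best = n;
            }
        }
        return best; // null when the input is empty
    }
}
```

The merge runs in time linear in the combined size of the two operands. A tree without stored positions (the JDOM case mentioned above) would need some other way to compare two nodes, which is where the cost comes from.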