
Re: XPath over DOM

Subject: Re: XPath over DOM
From: Uche Ogbuji <uche.ogbuji@xxxxxxxxxxxxxxx>
Date: Fri, 16 Feb 2001 08:27:31 -0700
> >  In abstract
> > terms, it's a single tree traversal, with one assignment per node.
> 
> That's one way of doing it, but it means you have to have somewhere to put
> the sequence number. You can't put it in the DOM objects themselves, so you
> have to create a static wrapper, which involves creating more objects:
> hopefully not one per node, otherwise you might as well rebuild the tree.
> The approach Xalan and xt use (I believe) is to do the document order
> comparison dynamically, by finding the lowest common ancestor of two nodes.

Ah.  I see.  I guess this is the advantage of Python's dictionaries 
(associative arrays).  It makes such external decoration quite trivial, and 
quite efficient.  Perl folks gain the same advantage, and C folks can do so 
with a simple hash table.
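
A minimal sketch of this kind of external decoration, assuming Python's
standard xml.dom.minidom rather than any particular DOM discussed in this
thread. The function name index_document_order is hypothetical. It makes one
traversal and one dictionary assignment per node, keyed by object identity,
so the DOM objects themselves are never touched. Namespace nodes are left out
for brevity.

```python
from xml.dom import minidom


def index_document_order(root):
    """Return a dict mapping id(node) -> document-order sequence number."""
    order = {}
    stack = [root]
    while stack:
        node = stack.pop()
        order[id(node)] = len(order)
        # In XPath document order, attributes follow their element
        # and precede its children.
        if node.nodeType == node.ELEMENT_NODE:
            attrs = node.attributes
            for i in range(attrs.length):
                order[id(attrs.item(i))] = len(order)
        # Push children reversed so they are popped left to right.
        stack.extend(reversed(node.childNodes))
    return order


doc = minidom.parseString('<a><b x="1"/><c>t</c></a>')
order = index_document_order(doc)
b, c = doc.documentElement.childNodes
print(order[id(b)] < order[id(c)])  # True: b precedes c
```

Comparing two nodes for document order is then a pair of dictionary lookups,
at the cost of one dictionary entry per node.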

> > > (b) skipping over and counting nodes correctly in the
> > presence of things
> > > such as entity reference nodes, CDATA nodes, and
> > unnormalized text nodes,
> > > and
> >
> > There is a normalize() if the user doesn't mind mutation.
> 
> Mutation of the supplied tree, I think, is out of the question. (This also
> makes whitespace stripping much more difficult - another thing I forgot to
> mention.)

Then you'd have to wrap with internal indices.  More complex.
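
One way such a wrapper could look, sketched under the assumption of Python's
xml.dom.minidom: adjacent text and CDATA nodes are grouped into a single
logical text node, and the position in the returned list serves as the
internal index. The names logical_children and text_value are hypothetical,
and entity reference nodes are not handled here.

```python
from xml.dom import minidom, Node

TEXT_TYPES = (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)


def logical_children(parent):
    """Return the children of parent with each run of adjacent text/CDATA
    nodes grouped into one list. The DOM tree itself is not modified."""
    items = []
    for child in parent.childNodes:
        if child.nodeType in TEXT_TYPES:
            if items and isinstance(items[-1], list):
                items[-1].append(child)   # extend the current text run
            else:
                items.append([child])     # start a new text run
        else:
            items.append(child)
    return items


def text_value(run):
    """String value of one logical text node (a list of DOM text nodes)."""
    return "".join(node.data for node in run)
```

The caller sees one text node where the DOM has several, and the supplied
tree keeps its original child count.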

> Incidentally, MSXML3 gets this wrong: using CDATA gives you multiple
> adjacent text nodes. I think that's evidence that it's not easy: and they
> have the advantage that they only work with their own DOM implementation.

No one says it's easy.  It takes time and experimentation.

> > The rest, at least
> > as I've attacked it, is a matter of wrapping, again in the
> > same pass as doc-order indexing.
> 
> I'm thinking of doing it (eventually) in Saxon by dynamic wrapping using
> flyweight objects, in the same way as the Saxon "tinytree" currently works.
> >
> > > (c) dealing with the multitude of ways that the DOM allows namespace
> > > nodes to be (or not be) represented.
> >
> > ???  Do you mean Level 1 vs. Level 2?
> 
> That's part of the issue. Element and attribute names in the DOM can contain
> a namespace URI, the namespace URI may or may not be present in an xmlns:xxx
> Attr node. The set of namespace nodes, as far as I can see, is the union of
> namespaces that are used in element and attribute nodes plus namespaces that
> are declared in xmlns:xxx pseudo-attributes, in the current element or in
> any ancestor.

Oh.  We sort this out on our scan pass.  The algorithm is pretty simple, 
actually.

But most of what we do takes advantage of dictionaries, which helps a lot.
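
A rough sketch of a scan pass along the lines described in the quoted text,
assuming Python's xml.dom.minidom. This is an illustration, not the actual
4Suite code, and scan_namespaces is a hypothetical name. Each element gets a
dictionary of in-scope namespaces: those inherited from ancestors, those
declared in xmlns attributes, and those used by element or attribute names
even when no declaration is present.

```python
from xml.dom import minidom, Node

XML_URI = "http://www.w3.org/XML/1998/namespace"


def scan_namespaces(element, inherited=None, result=None):
    """Return a dict mapping id(element) -> {prefix: uri} for element
    and all of its descendant elements."""
    if result is None:
        result = {}
    scope = dict(inherited or {"xml": XML_URI})
    attrs = element.attributes
    for i in range(attrs.length):
        attr = attrs.item(i)
        if attr.name == "xmlns":
            if attr.value:
                scope[""] = attr.value
            else:
                scope.pop("", None)       # xmlns="" undeclares the default
        elif attr.name.startswith("xmlns:"):
            scope[attr.name[6:]] = attr.value
        elif attr.namespaceURI and attr.prefix:
            # Namespace used by an attribute name, possibly undeclared.
            scope.setdefault(attr.prefix, attr.namespaceURI)
    if element.namespaceURI:
        # Namespace used by the element name, possibly undeclared.
        scope[element.prefix or ""] = element.namespaceURI
    result[id(element)] = scope
    for child in element.childNodes:
        if child.nodeType == Node.ELEMENT_NODE:
            scan_namespaces(child, scope, result)
    return result
```

Copying the parent's dictionary for each element keeps the logic simple; an
implementation concerned with memory might share dictionaries between
elements that add no declarations.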


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@xxxxxxxxxxxxxxx               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python



 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

