XPath filtering SAX was Re: SAX2: marrying SAX and DOM
Ken MacLeod wrote:

> * A parser can [should?] maintain a partial DOM tree, at least
>   parents, that would allow other XML functions to be used. For
>   example, using XPath to perform matching.

I have been doing that, trying to implement 'higher-level' event dispatching from a SAX event stream to a listener which defines what data it is interested in in the form of XPath expressions. The API goes roughly as follows (simplified for illustration) -

    public interface XPathListener {
        public abstract void handleData(Node[] nodes);
    }

    public interface XPathFilter {
        public abstract void addListener(XPathListener l, String match);
        public abstract void process(Parser p, InputSource i);
    }

Client code which wants to retrieve some data from an XML stream registers a node set expression identifying 'data of interest', and only this data will be returned. Assume the following XML data (partial document)

    <stream>
      <data type="int">1</data>
      <data type="str">x</data>
      <data type="str">y</data>
      .../...

One would register interest in the value of 'data' elements with 'str' types using the following code :

    XPathFilter xpf;
    xpf.addListener(this, "data[@type='str']/text()");
    xpf.process(somesaxparser, someinputsource);

the above data would result in two handleData() calls being made, once for each text node of a data element with type 'str'. This is much cleaner than the alternative - keeping track of state information in an object's startElement()/characters()/endElement() methods - especially if the element tree is deeper than a couple levels.

Naturally, not all XPath features 'work' over SAX - e.g. following-* axes or position() calls, depending on how much of the DOM tree you are willing to build as you go along.
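To make the idea concrete, here is a minimal sketch of what such a filter might look like for one very restricted pattern shape, name[@attr='value']/text(). It is not the proof-of-concept described in this post. Everything in it is an assumption made for illustration: the class name TextMatchFilter, the TextListener interface (which delivers plain Strings rather than Node[]), the regex standing in for an XPath parser, and the use of the SAX2 DefaultHandler in place of the SAX1 Parser shown in the interfaces above. The pattern is matched at any depth, and nested matches are ignored.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: handles only patterns shaped like name[@attr='value']/text().
class TextMatchFilter extends DefaultHandler {
    // Simplified stand-in for XPathListener: Strings instead of Node[].
    interface TextListener { void handleData(String text); }

    // A regex is used here instead of a real XPath expression parser.
    private static final Pattern SUPPORTED =
        Pattern.compile("(\\w+)\\[@(\\w+)='([^']*)'\\]/text\\(\\)");

    private final String element, attribute, value;
    private final TextListener listener;
    private final StringBuilder buffer = new StringBuilder();
    private int depth = 0;        // depth of the element currently open
    private int matchDepth = -1;  // depth of the matched element, -1 when outside one

    TextMatchFilter(String match, TextListener listener) {
        Matcher m = SUPPORTED.matcher(match);
        if (!m.matches()) {
            throw new IllegalArgumentException("unsupported pattern: " + match);
        }
        element = m.group(1);
        attribute = m.group(2);
        value = m.group(3);
        this.listener = listener;
    }

    // Deliver the text collected so far as one text node, if there is any.
    private void flush() {
        if (buffer.length() > 0) {
            listener.handleData(buffer.toString());
            buffer.setLength(0);
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // A child element opening inside the match ends the current text node.
        if (depth == matchDepth) flush();
        depth++;
        if (matchDepth < 0 && qName.equals(element)
                && value.equals(atts.getValue(attribute))) {
            matchDepth = depth;
            buffer.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Only text directly inside the matched element counts as text().
        if (depth == matchDepth) buffer.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (depth == matchDepth) {
            flush();
            matchDepth = -1;
        }
        depth--;
    }

    // Convenience wrapper: parse a string and collect every delivered text node.
    static List<String> match(String xml, String pattern) throws Exception {
        List<String> hits = new ArrayList<>();
        SAXParserFactory.newInstance().newSAXParser().parse(
            new InputSource(new StringReader(xml)),
            new TextMatchFilter(pattern, hits::add));
        return hits;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<stream><data type=\"int\">1</data>"
                   + "<data type=\"str\">x</data><data type=\"str\">y</data></stream>";
        System.out.println(match(xml, "data[@type='str']/text()"));
    }
}
```

The only state kept is a depth counter and one text buffer, so no DOM nodes are built at all. Because handleData() fires from within the SAX callbacks, each match is delivered as soon as its element closes rather than at the end of the document, which is the property a streaming protocol needs. Anything that requires looking ahead or counting siblings would need more retained state than this.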
I'm fairly sure though that with suitable restrictions this would be a worthwhile addition to the XML developer's arsenal, because XPath expressions are a concise way of identifying only the parts of an XML data stream that your program is interested in - without hand-coding specific automata every time.

If you are parsing whole documents, an XPath matcher on top of the DOM will do fine - but this does not work if you need to parse an incoming XML-formatted data stream and process data as it becomes available, and the class of application I'm working on (real-time chat using an XML-formatted protocol) requires that.

I have a quick-and-dirty, proof-of-concept implementation which works, kind of - the 'best' way of delivering data-of-interest to client code is not obvious (whether to use a DOM-compliant Node class or something more lightweight, whether to use arrays or more complex collections), and the XPath expression parser is unbelievably crude - mostly because in current XPath implementations the parsing code cannot be easily separated from code that relies on the DOM.

If anyone is working on something similar, or has suggestions on API or implementation, I'm interested in your comments.

========================================
Laurent Bossavit - Ingénieur R&D
>>> laurent@m... <<<
>> ICQ#39281367 <<
MultiMania http://www.multimania.fr/
========================================

***************************************************************************
This is xml-dev, the mailing list for XML developers.
To unsubscribe, mailto:majordomo@x...&BODY=unsubscribe%20xml-dev
List archives are available at http://xml.org/archives/xml-dev/
***************************************************************************