Re: Handling very large instance docs
> >At the very least I need to be able to sequentially process a large
> >document and extract an identified sub-tree (ideally denoted by an
> >XPath expression) for run-of-the-mill tools to manipulate. I assume
> >such a beast would need to be based on a SAX parser.
>
> I did exactly that in Python. I considered building an engine that
> could filter SAX events to those that match a limited version of
> XPath, but ran out of gas. I ended up with just a regular SAX
> application.

Interesting - I always thought such a thing would be useful, but I haven't come across an implementation.

I built something like that in Delphi (I call it SAXPath) on top of SAX. First you define an array of records (structs), each with a name (or a wildcard, as in XPath) and a call-back interface pointer (used for filtering/predicates or for processing). I call the array elements "path nodes" and the array itself a "path handler". Then you register such an array with a "handler manager" for processing. Only relative paths are currently supported.

A call-back is made for every node of a "path handler" as long as the path matches and as long as no filter call-back further up the path has de-activated the "path handler".

For the projects I am involved in this has proven very practical.

Karl
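
For readers who want to try the idea, here is a minimal sketch of the path-handler scheme described above, written in Python on top of the standard `xml.sax` module rather than in Delphi. It is not the SAXPath code. The names `PathNode`, `HandlerManager`, `register` and `demo`, the `"*"` wildcard, and the rule that a call-back returning `False` de-activates the handler for that subtree are all assumptions made for illustration.

```python
import xml.sax
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical call-back type: receives the element name and its attributes.
# Returning False de-activates the path handler below this element.
Callback = Callable[[str, dict], Optional[bool]]


@dataclass
class PathNode:
    name: str                         # element name, or "*" as a wildcard
    callback: Optional[Callback] = None


class HandlerManager(xml.sax.ContentHandler):
    """Dispatches SAX start-element events to registered path handlers."""

    def __init__(self) -> None:
        super().__init__()
        self.path_handlers: List[List[PathNode]] = []
        # One entry per open element: (handler, number of nodes matched so far).
        self.stack: List[list] = []

    def register(self, path_handler: List[PathNode]) -> None:
        self.path_handlers.append(path_handler)

    def startElement(self, name, attrs):
        parent = self.stack[-1] if self.stack else []
        # Continue partial matches inherited from the parent element ...
        candidates = [(h, i) for h, i in parent if i < len(h)]
        # ... and, because paths are relative, let every handler start here.
        candidates += [(h, 0) for h in self.path_handlers]
        active = []
        for handler, index in candidates:
            node = handler[index]
            if node.name not in ("*", name):
                continue              # name does not match this path node
            if node.callback is not None:
                if node.callback(name, dict(attrs.items())) is False:
                    continue          # filter de-activated the handler
            active.append((handler, index + 1))
        self.stack.append(active)

    def endElement(self, name):
        self.stack.pop()


def demo() -> list:
    """Collect the ids of <sku> elements found under open orders only."""
    found = []
    manager = HandlerManager()
    manager.register([
        PathNode("order", lambda name, attrs: attrs.get("status") == "open"),
        PathNode("*"),
        PathNode("sku", lambda name, attrs: found.append(attrs.get("id"))),
    ])
    document = (
        "<orders>"
        '<order status="open"><item><sku id="A1"/></item></order>'
        '<order status="closed"><item><sku id="B2"/></item></order>'
        "</orders>"
    )
    xml.sax.parseString(document.encode("utf-8"), manager)
    return found
```

In `demo()`, the call-back on the first path node acts as a predicate: the closed order fails it, so the remaining nodes of that handler are never tried inside that subtree. Only element-start events are dispatched here; extracting a whole sub-tree would also need the character data and end-element events to be forwarded while a handler is fully matched.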