Re: SaxXPathFragmentFilter - Reduse large DOM trees using a SA
<plug type="blatant">
And if you do want to have actual XPath-esque support, I heartily
suggest taking a look at SAXPath to do your XPath parsing for you.

http://saxpath.org/
</plug>

-bob

On Wed, 28 Nov 2001, PaulT wrote:

> I really like what you've done, but the language you're using is not
> XPath (nor is it a subset of XPath), and I see a problem here. (I think
> I also have some kind of solution to that problem, and I'll express it
> in my next letter.)
>
> Rgds.Paul.
>
> ----- Original Message -----
> From: "Niels Peter Strandberg" <nielspeter@n...>
> To: <xml-dev@l...>
> Sent: Wednesday, November 28, 2001 5:40 AM
> Subject: SaxXPathFragmentFilter - Reduse large DOM trees using a
> SAX XPath cutter!
>
> > I have made an experimental SAX XMLFilter. It allows you to "filter"
> > out the information in an XML document that you want to work with,
> > using XPath, and skip the rest. You can place the filter anywhere in
> > your application where an XMLFilter can be used.
> >
> > - I don't know if this has already been done by others?
> >
> > The whole idea is to "filter" out the fragments of the XML document
> > that you specify using an XPath expression, e.g.
> > SaxXPathFragmentFilter(saxparser, "/cellphone/*/model[@id='1234']",
> > "result"). Build a DOM tree from the result, or why not feed the SAX
> > events into an XSLT transformer and do some XSLT transformations.
> >
> > The big win is that you don't have to build a large DOM tree if you
> > only need part of the information in a large XML document. You just
> > specify which fragments you want using XPath, and the result will be
> > a much smaller DOM tree, which requires less processing, memory etc.
> >
> > Let us say that you have a large document with spare parts for Volvo
> > vehicles. You want to make a list of engine parts for the S70 car
> > model. What you do is specify the XPath (location path) that you want
> > to cut out of the document, e.g. "/catalog/cars/s70/parts/engine".
> >
> >     // Imports assumed for the JDOM and SAX 2 classes used below
> >     import org.jdom.Document;
> >     import org.jdom.input.SAXHandler;
> >     import org.xml.sax.XMLReader;
> >     import org.xml.sax.helpers.XMLReaderFactory;
> >
> >     // Your SAX parser here
> >     XMLReader parser =
> >         XMLReaderFactory.createXMLReader(
> >             "org.apache.xerces.parsers.SAXParser");
> >
> >     // JDOM handler that builds the document from the filtered events
> >     SAXHandler jdomsaxhandler = new SAXHandler();
> >
> >     String xpath = "/catalog/cars/s70/parts/engine";
> >     String rootName = "s70engineparts"; // this will be the new root
> >
> >     // Wrap the parser in the SaxXPathFragmentFilter
> >     SaxXPathFragmentFilter xpathfilter =
> >         new SaxXPathFragmentFilter(parser, xpath, rootName);
> >     xpathfilter.setContentHandler(jdomsaxhandler);
> >
> >     // Parse the document
> >     xpathfilter.parse(uri);
> >
> >     // Get the Document
> >     Document doc = jdomsaxhandler.getDocument();
> >
> >
> > This SaxXPathFragmentFilter is purely experimental. It is spaghetti
> > code. I just sat down with an idea and started to code, and the code
> > is not very pretty. It needs to be rewritten.
> >
> >
> > The XPath support is very limited for now. Here is the XPath you can
> > use today with this filter:
> >
> >   "/a/b" - An absolute path.
> >   "/a/*/c" - An absolute path where the second element ("*") can be
> >       any element.
> >   "/a/*/c[@att='value']" - If element c has an attribute att with
> >       the value 'value'.
> >   "/a/*/c[contains='value']" - If the first child node of element c
> >       is a text node that contains 'value'.
> >   "/a/*/c[starts-with='value']" - If the first child node of element
> >       c is a text node that starts with 'value'.
> >   "/a/*/c[ends-with='value']" - If the first child node of element c
> >       is a text node that ends with 'value'.
> >   "/a/*/c['value']" - If the first child node of element c is a text
> >       node that is 'value'.
> >   "/a/*/c[is='value']" - As above.
> >
> > As you can see, the XPath options are very limited. But I think that
> > when I find a way to implement the "//" pattern, the filter will be
> > even more powerful.
> >
> > I have problems building a DOM tree from the result using Xerces and
> > Saxon. But with JDOM it works great. This needs to be fixed.
> >
> > You cannot rely on the result always being correct, so don't use
> > this in any application; just use it for experimentation.
> >
> > You can find the code at:
> >
> > http://www.npstrandberg.com/projects/saxxpathfragmentfilter/saxxpathfragmentfilter.tar.gz
> >
> > My goal with this filter is to keep it reliable, simple, fast and
> > clean. If you want to contribute to this project, you will be
> > welcome. The filter will be released under some kind of open source
> > license (if we get that far!).
> >
> > Test it and give me some feedback on what you think.
> >
> >
> > Regards, Niels Peter Strandberg
> >
> >
> > -----------------------------------------------------------------
> > The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> > initiative of OASIS <http://www.oasis-open.org>
> >
> > The list archives are at http://lists.xml.org/archives/xml-dev/
> >
> > To subscribe or unsubscribe from this list use the subscription
> > manager: <http://lists.xml.org/ob/adm.pl>
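
For anyone reading this thread in the archive: the path matching described in the quoted message (an absolute path with optional "*" steps) can be sketched on top of the standard SAX 2 XMLFilterImpl class. This is only a minimal illustration under stated assumptions, not Niels Peter's code: the class name PathFragmentFilterSketch and the helper method filter are hypothetical, predicates such as [@att='value'] are not handled, element names are compared by qName with no namespace handling, and the helper writes elements without their attributes.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical sketch: forwards only the subtrees whose element path matches
// an absolute path such as "/a/*/c", wrapped in a new synthetic root element.
class PathFragmentFilterSketch extends XMLFilterImpl {
    private final String[] steps;                      // e.g. {"a", "*", "c"}
    private final String rootName;                     // name of the synthetic root
    private final Deque<String> path = new ArrayDeque<>();
    private int matchDepth = -1;                       // depth of the open match, -1 if none

    PathFragmentFilterSketch(XMLReader parent, String locationPath, String rootName) {
        super(parent);
        this.steps = locationPath.substring(1).split("/");
        this.rootName = rootName;
    }

    @Override public void startDocument() throws SAXException {
        super.startDocument();
        super.startElement("", rootName, rootName, new AttributesImpl());
    }

    @Override public void endDocument() throws SAXException {
        super.endElement("", rootName, rootName);
        super.endDocument();
    }

    @Override public void startElement(String uri, String local, String qName,
                                       Attributes atts) throws SAXException {
        path.addLast(qName);
        if (matchDepth < 0 && matches()) matchDepth = path.size();
        if (matchDepth >= 0) super.startElement(uri, local, qName, atts);
    }

    @Override public void endElement(String uri, String local, String qName)
            throws SAXException {
        if (matchDepth >= 0) super.endElement(uri, local, qName);
        if (path.size() == matchDepth) matchDepth = -1; // the matched element just closed
        path.removeLast();
    }

    @Override public void characters(char[] ch, int start, int length) throws SAXException {
        if (matchDepth >= 0) super.characters(ch, start, length);
    }

    @Override public void ignorableWhitespace(char[] ch, int start, int length)
            throws SAXException {
        if (matchDepth >= 0) super.ignorableWhitespace(ch, start, length);
    }

    // True when the current element path has exactly the requested steps.
    private boolean matches() {
        if (path.size() != steps.length) return false;
        int i = 0;
        for (String name : path) {
            if (!steps[i].equals("*") && !steps[i].equals(name)) return false;
            i++;
        }
        return true;
    }

    // Hypothetical helper: runs the filter over an XML string and returns the
    // forwarded events as text (attributes and escaping omitted for brevity).
    static String filter(String xml, String locationPath, String rootName) {
        final StringBuilder out = new StringBuilder();
        try {
            XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            PathFragmentFilterSketch f =
                new PathFragmentFilterSketch(parser, locationPath, rootName);
            f.setContentHandler(new DefaultHandler() {
                @Override public void startElement(String u, String l, String q, Attributes a) {
                    out.append('<').append(q).append('>');
                }
                @Override public void endElement(String u, String l, String q) {
                    out.append("</").append(q).append('>');
                }
                @Override public void characters(char[] ch, int start, int length) {
                    out.append(ch, start, length);
                }
            });
            f.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }
}
```

Calling PathFragmentFilterSketch.filter on a small catalog document with the path "/catalog/cars/s70/parts/engine" and the root name "s70engineparts" yields only the engine subtree inside the new root; everything else is skipped before any tree is built. Tracking the element path on a stack keeps the filter streaming, which is the point of the approach described above.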