RE: Does SAX make sense?
I totally agree, Michael. I would like to see a mixed model which allows XQuery filtering of events and similar behavior. Some layers based on SAX filters seem to come close to this goal.

> Rick Jelliffe wrote:
> >
> > I am idly wondering whether unpooled streaming Java APIs of XML documents
> > (e.g. SAX) really make as much sense as we might like them to.
>
> I've been wondering why we have such absolute either-or choices in
> available APIs. Why not hybridized APIs that provide event streams, but
> let you collect arbitrary spans of content into an object model that can
> be more easily manipulated and accessed without needing a complete
> in-memory tree model of the entire document?
>
> I think both SAX and tree APIs are unwieldy to work with. I'm more
> interested in rule-based and pattern-based approaches, but prefer not to
> have to build a complete in-memory model of the entire document to
> enable such an approach.
>
> > It strikes me that there are two factors that undermine the benefits of
> > streaming processing:
> >
> > * XML documents are rarely larger than memory
> > * Java implementations typically only garbage collect when they get
> >   "near" to filling their heaps.
> >
> > These two things conspire to make it that, for the lion's share of
> > documents, by the time the SAX stream is finished, all the SAX events
> > will still be in memory, though perhaps unreachable. If they are in
> > memory, why not make them available?
> >
> > That being the case, it seems that simple streaming such as SAX provides
> > doesn't make sense. It would be better to either
> >
> > * have the SAX stream kept cached for the lifetime of the document
> >   (or have some kind of weak reference perhaps) since they are in memory
> >   anyway (though unreachable), allowing backward-looking XPaths; or
>
> Pooling objects using weak references incurs a small performance penalty
> (I've experimented a bit with such approaches, though not for SAX
> events). In the context of a real-world application this penalty is
> likely to be pretty minimal. Nonetheless, if someone is using SAX, it
> may be because they are trying to maximize performance.
>
> > * require SAX clients to return events to a pool (which would reduce
> >   memory use).
> >
> > Does that sound right to anyone?
>
> The approach I'm experimenting with, right now, in my swan toolkit
> (http://swan.sourceforge.net) is maintaining a stack to support
> backward-looking XPaths and XSLT pattern-matching, melded with rules
> that can gather content into suitable data structures for relevant
> portions of a document. As part of that, I have a prefab rule one can
> use to gather up a fragment into a minimalistic tree API that supports
> XPath queries. This could easily be adapted to use a full-fledged tree
> API for the fragment, but I was more interested in using XPath
> expressions than navigating unwieldy tree APIs.
>
> This is still all in a rough state. I haven't done a file release of
> this code yet, and some key portions are not in CVS yet (due to some
> problems I've been having with CVS integration with Eclipse). I've also
> been letting this languish the last few weeks, but am starting to get
> back into it this weekend. I've been approaching this in a rather lazy
> fashion (my motivation has been admittedly low), but I hope to have an
> alpha release of something soon.
>
> -----------------------------------------------------------------
> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> initiative of OASIS <http://www.oasis-open.org>
>
> The list archives are at http://lists.xml.org/archives/xml-dev/
>
> To subscribe or unsubscribe from this list use the subscription
> manager: <http://lists.xml.org/ob/adm.pl>
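
For readers who want something concrete, the hybrid idea discussed in this thread (a SAX event stream, a stack of open elements for backward-looking context, and a rule that gathers selected fragments into a small tree) might be sketched in Java roughly as below. This is only an illustration under stated assumptions and is not code from the swan toolkit: the names FragmentCollector, Node, Fragment and collect are hypothetical, the "rule" is simply a match on element name, and only the standard JAXP/SAX classes are real APIs.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch: streams with SAX, but gathers matching elements into small trees. */
final class FragmentCollector extends DefaultHandler {

    /** Minimal tree node for a collected fragment (illustrative only). */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        final StringBuilder text = new StringBuilder();
        Node(String name) { this.name = name; }
    }

    /** A collected fragment plus the ancestor path in effect when it started. */
    static final class Fragment {
        final String ancestorPath;
        final Node root;
        Fragment(String ancestorPath, Node root) {
            this.ancestorPath = ancestorPath;
            this.root = root;
        }
    }

    private final String targetName;
    // Names of the currently open elements: the backward-looking context.
    private final Deque<String> openElements = new ArrayDeque<>();
    // Nodes under construction; non-empty only while inside a matching fragment.
    private final Deque<Node> building = new ArrayDeque<>();
    private final List<Fragment> fragments = new ArrayList<>();

    FragmentCollector(String targetName) { this.targetName = targetName; }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (!building.isEmpty()) {
            // Already inside a fragment: attach this element to the tree being built.
            Node child = new Node(qName);
            building.peek().children.add(child);
            building.push(child);
        } else if (qName.equals(targetName)) {
            // The rule fires: start a new fragment and remember its ancestors.
            Node root = new Node(qName);
            fragments.add(new Fragment(currentPath(), root));
            building.push(root);
        }
        openElements.push(qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text outside a fragment is streamed past and never stored.
        if (!building.isEmpty()) building.peek().text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        openElements.pop();
        if (!building.isEmpty()) building.pop();
    }

    /** Builds a path such as /doc/sec from the stack, outermost element first. */
    private String currentPath() {
        StringBuilder sb = new StringBuilder();
        Iterator<String> it = openElements.descendingIterator();
        while (it.hasNext()) sb.append('/').append(it.next());
        return sb.toString();
    }

    /** Parses the XML string and returns every fragment rooted at targetName. */
    static List<Fragment> collect(String xml, String targetName) {
        FragmentCollector handler = new FragmentCollector(targetName);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler.fragments;
    }

    public static void main(String[] args) {
        String xml = "<doc><sec><item>a</item></sec><item>b<sub>c</sub></item></doc>";
        for (Fragment f : collect(xml, "item")) {
            System.out.println(f.ancestorPath + " -> " + f.root.name
                    + " (" + f.root.children.size() + " child elements)");
        }
    }
}
```

Only the stack of open element names and the fragments that match are held in memory; everything else is discarded as it streams past. A real toolkit would presumably match on patterns rather than a single element name and would expose the collected fragment to XPath, which this sketch does not attempt.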