Re: Does SAX make sense?
Jimmy,

zhengyu wrote:

>I have got a weird question in mind that I would like to toss it out.
>
>Suppose there is a way to offer DOM type interface with SAX kind of
>efficiency.
>

Matthew O'Donnell and I have made a series of presentations on this
particular issue. Our latest proposal is known as JITTs
(Just-In-Time-Trees), and you can find presentations/papers at the
JITTs homepage, http://www.jitts.org, or you can visit our homepage on
overlapping markup at http://www.sbl-site2.org/Overlap/.

The basic idea is that markup (and hence trees) is recognized as part
of the processing of a file and has no meaning for a parser until the
parser has been told to recognize that particular markup token. What
would be required is to change the order of processing used by most
(if not all) XML parsers: process the DTD/Schema first and use the
resulting tree as the basis for recognition of markup events by SAX.
(The SAX module would then recognize only the markup tokens in that
tree.)

The only problem with that approach that has been suggested to us
involves directly nested elements, such as <div>blah, blah<div>blah,
blah</div>blah, blah</div>, but the incidence of such markup is
unknown.

The advantage to our approach is that a DomLite tree could be
constructed that retains the unrecognized markup (unlike a SAX filter)
and, upon retrieval of the container (recognized markup), the
previously unrecognized markup could be processed for presentation to
the user. Simulated tests of this type of processing indicate
substantial gains in processing speed over traditional construction of
full DOM trees.

Another advantage is that it operates with standard XML syntax, unlike
some proposals, such as LMNL, which has its own (non-XML) format.

>How long would it take for the new processing model to become really
>popular?
>
>

Well, it has not become popular (yet!)
but the rise of partial-parsing XML parsers and the like indicates the
need for something more efficient than current processing models for
XML.

JITTs has been criticized because it makes well-formedness a question
that is answered at the time of processing. Personally, I don't find
well-formedness apart from recognition at the time of processing by a
parser all that compelling (or even meaningful). There are substantial
advantages to meeting the requirements of well-formedness as part of
processing.

I think the first successful JITTs parser that can be applied to large
documents (the subject of the usual posts to this list: "I have a 10 MB
document and need to build a DOM tree...") will force a change in the
current "markup recognition first, useful document processing later"
approach.

The whole point of markup was to enable the processing of documents,
not to create artificial limitations to prevent it.

Patrick

>Jimmy
>----- Original Message -----
>From: "Karl Waclawek" <karl@w...>
>To: <xml-dev@l...>
>Sent: Sunday, May 25, 2003 7:00 PM
>Subject: Re: Does SAX make sense?
>
>
>
>
>>>There are several implementations, but I don't know of any standard
>>>interface. I have been thinking that having a standard interface just
>>>for passing XPath expressions to an event parser would be great. Anyone
>>>know of a standard being worked, implementations, or interested in
>>>starting a working group? If so, I'm in.
>>>
>>>
>>I am working on something similar, but much simpler right now.
>>My XPaths are just straight paths, or in other words, element types.
>>
>>My initial plan was to build a DTD (or other schema) validator
>>(on top of SAX) which has callback hooks for custom validation
>>or processing. The callbacks are registered by the application
>>based on a path - but rather a path based on the schema object
>>model and not the document object model. Every node in the SOM
>>corresponds to a separate set of callbacks.
>>
>>So far I was not thinking of anything more complex, as I think
>>this would be quite an effort.
>>
>>Karl
>>
>>-----------------------------------------------------------------
>>The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
>>initiative of OASIS <http://www.oasis-open.org>
>>
>>The list archives are at http://lists.xml.org/archives/xml-dev/
>>
>>To subscribe or unsubscribe from this list use the subscription
>>manager: <http://lists.xml.org/ob/adm.pl>
>>

-- 
Patrick Durusau
Director of Research and Development
Society of Biblical Literature
Patrick.Durusau@s...
Co-Editor, ISO 13250, Topic Maps -- Reference Model
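
The selective-recognition idea in the message above (build a light tree
only for markup the processor has been told to recognize, and retain
everything else inside its container for later processing) can be
roughly illustrated on top of a stock SAX parser. This is only a sketch
and not the JITTs implementation: a standard SAX parser still tokenizes
every tag, so only the tree-building side is shown here. The names
LiteNode, SelectiveTreeBuilder and build_lite_tree are hypothetical, and
the "recognized" set is simply passed in by hand, where the proposal
would derive it from the DTD/Schema.

```python
import xml.sax
from xml.sax.saxutils import escape, quoteattr


class LiteNode:
    """Node for one recognized element (hypothetical helper type)."""

    def __init__(self, name, attrs):
        self.name = name
        self.attrs = dict(attrs.items())
        # Mixed content in document order: str items hold text and
        # re-serialized unrecognized markup, LiteNode items hold
        # recognized child elements.
        self.content = []

    def children(self):
        return [c for c in self.content if isinstance(c, LiteNode)]

    def raw_text(self):
        return "".join(c for c in self.content if isinstance(c, str))


class SelectiveTreeBuilder(xml.sax.ContentHandler):
    """Builds nodes only for element names in `recognized`."""

    def __init__(self, recognized):
        super().__init__()
        self.recognized = set(recognized)
        self.root = LiteNode("#document", {})
        self.stack = [self.root]

    def startElement(self, name, attrs):
        if name in self.recognized:
            node = LiteNode(name, attrs)
            self.stack[-1].content.append(node)
            self.stack.append(node)
        else:
            # Unrecognized markup is kept as text in the current container.
            attr_text = "".join(
                f" {key}={quoteattr(value)}" for key, value in attrs.items()
            )
            self.stack[-1].content.append(f"<{name}{attr_text}>")

    def endElement(self, name):
        if name in self.recognized:
            self.stack.pop()
        else:
            self.stack[-1].content.append(f"</{name}>")

    def characters(self, content):
        self.stack[-1].content.append(escape(content))


def build_lite_tree(xml_text, recognized):
    """Parse xml_text and return the root of the light tree."""
    handler = SelectiveTreeBuilder(recognized)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.root
```

For example, build_lite_tree('<doc><div>one <b>bold</b> text</div></doc>',
{"div"}) yields one div node whose raw_text() still contains the <b>
markup as plain text. That retained text could be handed to a second
pass with a different recognized set when the container is retrieved,
which is the part of the proposal this sketch is meant to suggest.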