Re: Simple approaches to XML implementation
>>class XMLParser {
>>...
>>parser(XMLEventHandler handler);
>>...
>>}
>
>That's one way of doing things. The main problem I see with this interface
>is that there are quite a few possible methods (I count 71 classdefs in
>the SGML property set, though of course not all of those are applicable to
>XML), and it becomes difficult to expand the set of events.

I use about 8 event handlers for most of my APIs...

>As much as possible, a good reusable component should not force the
>user's hand when choosing what node to grab onto. As an example,
>YACC is pretty bad about this. You supply it with a lexer (with a
>fixed name) and a set of handlers to be called when productions are
>reduced. The YACC-generated parser insists on being in charge.

Sure. The important thing is that if you want to query into a document,
you have to have parsed at least as far as the nodes you want to access,
and that having a tree representation for such cases makes it a *lot*
easier.

For cases where you "want to be in control", I would have the event handler
be a grove constructor, and have the application work upon the grove.

Note that accessing a grove, or querying a document, is *different* to
*parsing* a document.

>1. An external entity manager, responsible for obtaining document
>   instances (the "start" document and others), DTDs, etc. from
>   local storage, the web, some database, etc. This should probably
>   be user-customizable.

I'm not sure about this. In some ways, I cannot see the reason for
*exposing* an entity manager, but then again, I cannot imagine an
implementation without one either....

>2. An encoding manager, responsible for mapping one of the possible
>   XML document encodings (Latin-n, UTF-7, UTF-8, UCS-2, UTF-16, whatever)
>   onto ISO10646 characters.

Streams...

>3. The parser itself, responsible for turning characters into XML events,
>   and possibly into grove structures.

Push grove building off to later stages.
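
To illustrate the "event handler as grove constructor" idea above, here is a minimal sketch in Python. It is not from the original thread. The names `Node`, `GroveBuilder`, `find_all` and `parse_to_grove` are hypothetical, the tree is a plain element tree and not a real grove, and the standard library's expat binding stands in for whatever event-driven parser is actually used. The parser only emits events; building the tree and querying it are separate, later stages.

```python
from dataclasses import dataclass, field
from xml.parsers import expat


@dataclass
class Node:
    """Hypothetical tree node; a simplified stand-in for a grove node."""
    name: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)
    text: str = ""


class GroveBuilder:
    """Event handler whose only job is to construct the tree."""

    def __init__(self):
        self.root = None
        self._stack = []  # currently open elements, innermost last

    def start_element(self, name, attrs):
        node = Node(name, dict(attrs))
        if self._stack:
            self._stack[-1].children.append(node)
        else:
            self.root = node
        self._stack.append(node)

    def characters(self, data):
        # The parser may deliver text in several chunks, so accumulate it.
        if self._stack:
            self._stack[-1].text += data

    def end_element(self, name):
        self._stack.pop()


def parse_to_grove(xml_text):
    """Parsing stage: feed parser events into the builder, return the tree."""
    builder = GroveBuilder()
    parser = expat.ParserCreate()
    parser.StartElementHandler = builder.start_element
    parser.EndElementHandler = builder.end_element
    parser.CharacterDataHandler = builder.characters
    parser.Parse(xml_text, True)
    return builder.root


def find_all(node, name):
    """Query stage: depth-first search over the finished tree."""
    found = [node] if node.name == name else []
    for child in node.children:
        found.extend(find_all(child, name))
    return found


root = parse_to_grove("<doc><sec id='a'>one</sec><sec id='b'>two</sec></doc>")
sections = find_all(root, "sec")
```

The application never sees individual parser events here. It works on the finished tree, which is what lets it "be in control" without the parser having to support that directly.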
>[Browser] gives the most complicated parser, since it has to asynchronously
>handle information from several different documents.
>
>[YACC] is the easiest to write, but it's less flexible. Given [Browser],
>it's easy to write [YACC]. (Given [XMLEventStream] you can also derive
>[YACC], but with greater overhead.)
>
>[XMLEventStream] and [Grove] give you the most flexibility with respect to
>the grove plan.

I think these conflate many different processing layers.

>languages, but the only firm conclusion I've come to is that I really wish
>I could use coroutines.

Amen to that sentiment.

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@i... the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@i...)