Re: Penance for misspent attributes
SAX is great for generic XML handling - it's easy to hook up a handler for building a document representation using DOM or some other model, for instance. It's very awkward for direct processing by an application, though, and I think autogenerating state machines just adds another layer of complexity.

Pull parsers seem a better approach for this type of application. Using a pull parser gets you away from all the problems of event-driven state machine programming and lets you process the document structure directly. You can see my JavaWorld comparison at http://www.javaworld.com/javaworld/jw-03-2002/jw-0329-xmljava2.html for some discussion and code examples on this topic.

The only real problem with using pull parsers right now is limited availability. The XMLPull site at http://www.xmlpull.org has details of the common interface currently implemented by two pull parsers (with hopefully more to come), so it's a big step in the right direction. There's also a JSR in progress (JSR-173) to develop a Java standard API for pull parsers.

  - Dennis

Bill de hÓra wrote:

>>-----Original Message-----
>>From: Sean McGrath [mailto:sean.mcgrath@p...]
>>
>>There is more to it than a buffer. Parsers can and do emit
>>chunks of content at boundaries that suit themselves. So
>>
>><foo>
>>Hello world
>></foo>
>>
>>is not guaranteed to produce 1 data event that can be slurped
>>into a buffer in one go. More generally, in the presence of
>>mixed content there will definitely be multiple chunks. So
>>you end up with this pattern:
>>
>>start_foo:
>>    buffer = ""
>>    inFoo = 1
>>
>>end_foo:
>>    print buffer
>>
>>characters (chunk):
>>    if inFoo:
>>        buffer.append (chunk)
>>
>>This rapidly gets out of hand.
>>
>
>Yes it does. However we can start to accept we're hacking a state
>machine and encapsulate the conditional reasoning:
>
>start_foo:
>    enterState(start_foo)
>
>end_foo:
>    getHandler().execute()
>    leaveState(start_foo)
>
>characters (chunk):
>    getHandler().accept(chunk)
>
>This can be data driven and very fast; it works much like a simple
>dispatching server or the lookup tables common enough in game
>programming. Granted we've been here before about how developers
>find state machines awkward but it does leave open the possibility
>of being declared and then autogenerated. Was this approach never
>taken with SGML? There doesn't seem to be a lot of work being done in
>the public domain to codegen saxhandlers (maybe I'm looking in the
>wrong places), but I expect it will become common enough. I'm
>pretty sure people are using Maps and the like to key event
>handlers, but I haven't seen it in the wild.
>
>Bill de hÓra
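
For comparison, here is a rough sketch of the two approaches discussed in this thread, applied to the same mixed-content input. It is written in Python with only the standard library, so it is not the XMLPull or JSR-173 API: `xml.dom.pulldom` merely stands in for a pull parser, and every class and function name below (`TextCollector`, `Ignore`, `DispatchingHandler`, `sax_foo`, `collect_text`, `pull_foo`) is made up for illustration. The first half keys per-element handlers from a dict and keeps a stack of states, roughly in the spirit of the enterState/leaveState pseudocode; the second half pulls events directly, so the nesting of the code follows the nesting of the document.

```python
import xml.sax
from xml.dom import pulldom

DOC = "<doc><foo>Hello <b>big</b> world</foo><bar>skip me</bar></doc>"


# --- SAX with a dict-keyed state machine (illustrative names) ---

class TextCollector:
    """Per-state handler: buffers every chunk, reports when the state exits."""

    def __init__(self, name, sink):
        self.name, self.sink, self.chunks = name, sink, []

    def accept(self, chunk):
        self.chunks.append(chunk)

    def execute(self):
        self.sink.append((self.name, "".join(self.chunks)))


class Ignore:
    """Default state: discards character data."""

    def accept(self, chunk):
        pass

    def execute(self):
        pass


class DispatchingHandler(xml.sax.ContentHandler):
    def __init__(self, table):
        super().__init__()
        self.table = table          # element name -> handler factory
        self.states = [Ignore()]    # stack of active handlers

    def startElement(self, name, attrs):
        factory = self.table.get(name)
        # Elements not in the table inherit the enclosing handler, so text
        # inside <b> still reaches the collector for <foo>.
        self.states.append(factory() if factory else self.states[-1])

    def endElement(self, name):
        handler = self.states.pop()
        if name in self.table:
            handler.execute()

    def characters(self, chunk):
        # May be called several times per text node; the handler buffers.
        self.states[-1].accept(chunk)


def sax_foo(xml_text):
    results = []
    table = {"foo": lambda: TextCollector("foo", results)}
    xml.sax.parseString(xml_text.encode("utf-8"), DispatchingHandler(table))
    return results


# --- Pull style: the application asks for the next event ---

def collect_text(events):
    """Consume events up to the end tag matching an already-seen start tag."""
    chunks, depth = [], 1
    for event, node in events:
        if event == pulldom.CHARACTERS:
            chunks.append(node.data)
        elif event == pulldom.START_ELEMENT:
            depth += 1
        elif event == pulldom.END_ELEMENT:
            depth -= 1
            if depth == 0:
                break
    return "".join(chunks)


def pull_foo(xml_text):
    events = pulldom.parseString(xml_text)
    results = []
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == "foo":
            # Same event stream, consumed by a helper: no flags, no callbacks.
            results.append(("foo", collect_text(events)))
    return results


if __name__ == "__main__":
    print(sax_foo(DOC))
    print(pull_foo(DOC))
```

In the pull version the "am I inside foo?" question is answered by where the code is executing rather than by a flag or a state stack, which is the point made above about processing the document structure directly. The dict-driven SAX version does keep the conditionals out of the callbacks, at the cost of the extra handler classes.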