Re: RE: Penance for misspent attributes
5/17/02 3:28:33 AM, Sean McGrath <sean.mcgrath@p...> wrote:

>There is more to it than a buffer. Parsers can and do emit chunks of content
>at boundaries that suit themselves. So
>
><foo>
>Hello world
></foo>
>
>is not guaranteed to produce 1 data event that can be slurped into a buffer in
>one go. More generally, in the presence of mixed content there will definitely
>be multiple chunks. So you end up with this pattern:
>
>start_foo:
>    buffer = ""
>    inFoo = 1
>
>end_foo:
>    print buffer
>
>characters (chunk):
>    if inFoo:
>        buffer.append (chunk)
>
>This rapidly gets out of hand.
>
>Rightly, the need for this pattern drives the data-heads nuts. It would be
>soo nice to
>know that in the presence of data-oriented XML, the fundamental parser
>layer would
>emit complete PCDATA chunks.
>
>Trouble is, there is no consensus on what data-oriented XML is and how
>it could be flagged to a processor. Consequently, data-oriented APIs that
>avoid the above upside-down and state-space-laden constructs
>such as RAX (http://www.xml.com/pub/a/2000/04/26/rax) cannot go
>anywhere.
>
>An XML Features Manifest would be one way to flag it
>(http://www.lists.ic.ac.uk/hypermail/xml-dev/xml-dev-Dec-1999/0002.html)
>but that never went anywhere either:-)

An even simpler alternative is a SAX filter that does nothing but condense
consecutive PCDATA events. Such a thing exists, for example, in the Perl
world as XML::Filter::BufferText. That way, you don't have to flag anything
to the processor; you just read its output through a pair of appropriately
tinted glasses.
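
For the curious, here is roughly what such a condensing filter could look like in Python, built on the standard library's xml.sax.saxutils.XMLFilterBase. This is only a minimal sketch of the idea, not a port of XML::Filter::BufferText; the names BufferTextFilter, Collector and text_events are made up for illustration. It holds back character chunks and flushes them as a single characters() event as soon as any other structural event arrives.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase


class BufferTextFilter(XMLFilterBase):
    """Hypothetical filter: merges consecutive characters() events into one."""

    def __init__(self, parent=None):
        super().__init__(parent)
        self._chunks = []

    def _flush(self):
        # Pass on everything buffered so far as a single characters() event.
        if self._chunks:
            super().characters("".join(self._chunks))
            self._chunks = []

    def characters(self, content):
        # Hold the chunk back instead of forwarding it immediately.
        self._chunks.append(content)

    # Any non-text event ends the current run of text, so flush first.
    def startElement(self, name, attrs):
        self._flush()
        super().startElement(name, attrs)

    def endElement(self, name):
        self._flush()
        super().endElement(name)

    def startElementNS(self, name, qname, attrs):
        self._flush()
        super().startElementNS(name, qname, attrs)

    def endElementNS(self, name, qname):
        self._flush()
        super().endElementNS(name, qname)

    def processingInstruction(self, target, data):
        self._flush()
        super().processingInstruction(target, data)

    def endDocument(self):
        self._flush()
        super().endDocument()


class Collector(xml.sax.ContentHandler):
    """Illustrative downstream handler: records each text event it receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def characters(self, content):
        self.events.append(content)


def text_events(xml_text):
    """Parse xml_text through the filter and return the text events seen."""
    collector = Collector()
    reader = BufferTextFilter(xml.sax.make_parser())
    reader.setContentHandler(collector)
    reader.parse(io.BytesIO(xml_text.encode("utf-8")))
    return collector.events
```

The downstream handler never needs an inFoo flag or its own buffer: whatever the underlying parser decides to do at entity or buffer boundaries, each run of text between two tags reaches it as one event. Mixed content still yields several events, one per run, which is the most a filter at this level can promise.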