RE: [Question] How to do incremental parsing?

From: Jeff Lowery <jlowery@s...>
To: "'Xu, Mousheng (SEA)'" <Mousheng.Xu@s...>,"'xml-dev@l...'" <xml-dev@l...>
Date: Wed, 04 Jul 2001 10:33:24 -0700

For large documents where:
	a) the target data is sparse (bits scattered throughout the large
	   document), and
	b) the location of the data is known (which branch of the tree it
	   hangs off),
you might get the best performance, in both speed and memory, from a
pull-based parser like kXML (www.enhydra.org). With a pull-based parser,
you skip lightly over the nodes you're not interested in until you reach a
node whose content you're looking for.
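
To make the pull style concrete, here is a minimal sketch. It uses Python's
standard-library xml.dom.pulldom rather than kXML itself, so it shows the
general technique and not kXML's API. The function name pull_text and the
sample catalog document are made up for illustration.

```python
import io
from xml.dom import pulldom


def pull_text(source, target_tag):
    """Yield the text of each <target_tag> element, skipping all other nodes."""
    events = pulldom.parse(source)
    for event, node in events:
        # Events for uninteresting nodes are simply pulled and dropped.
        if event == pulldom.START_ELEMENT and node.tagName == target_tag:
            # Build a small DOM subtree for this one element only.
            events.expandNode(node)
            yield "".join(
                child.data
                for child in node.childNodes
                if child.nodeType == child.TEXT_NODE
            )


# Hypothetical sample input; a real source would be a large file opened
# as a stream.
doc = io.BytesIO(
    b"<catalog>"
    b"<item><name>a</name><price>1.50</price></item>"
    b"<item><name>b</name><price>2.25</price></item>"
    b"</catalog>"
)
prices = list(pull_text(doc, "price"))
```

Only the subtree under each matching element is ever materialized, so memory
use should track the size of the target nodes rather than the whole document.
I have not measured this.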

Caveats:
1) I don't know whether anyone has actually done performance testing to
verify the above claim, and
2) kXML, at least, has some limitations. Quote:
- kXML does not support user defined (external) entities.
- The doctype declaration is not parsed. However, a corresponding "legacy
event" is generated by the parser, so application programmers are able to
parse the doctype declaration themselves.

> -----Original Message-----
> From: Xu, Mousheng (SEA) [mailto:Mousheng.Xu@s...]
> Sent: Tuesday, July 03, 2001 5:27 PM
> To: 'xml-dev@l...'
> Subject: [Question] How to do incremental parsing?
> 
> 
> Dear all,
> 
> A problem of all the current XML parsers is that they at least read the
> whole XML document into the input stream, which can consume a lot of
> memory when the XML is big (e.g. 1 GB).
> 
> One way to get around the problem would be to read the XML file into
> memory gradually and when needed. I would like to build such a DOM
> parser, but I am not familiar with the design of the Xerces XML parsers.
> Could someone give me a suggestion on how to tackle the problem? The most
> critical part would be the method to parse an element. If reading the
> whole document into memory is inevitable, then I would like to borrow the
> method which parses the input stream to get the next element.
> 
> Your help is highly appreciated.
> 
> Thanks in advance.
> 
> -- Mousheng Xu 
