
Re: [Question] How to do incremental parsing?

  • From: Guy Murphy <guy-murphy@e...>
  • To: Tony.Coates@r..., xml-dev@l...
  • Date: Wed, 04 Jul 2001 11:58:46 +0100

Hiya.

Or the third alternative is: don't use XML to actually work with your data.
Sometimes XML is the perfect fit to work with, sometimes it's not.

XML is very good for data interchange, but often with large datasets
developers can bend themselves out of shape to accommodate XML, when simply
resorting to a traditional RDBMS would be a lot quicker. Even in such
applications XML still has a pivotal role for reading into, and writing out
of, the application (serialising data), but a big question mark has to be
drawn over the DOM with regard to its suitability as an interface for large
datasets.

The persistent DOM is an interesting idea, and one I've thought about using
in the past (I actually thought of implementing a lightweight mini-DOM over
MySQL), but I'm not convinced it's got much of a long-term future... it
might do, and I could well be wrong. I have no strong assertions to make,
simply that one might hope that XML querying will make XML (or at least
XML-aware) databases attractive.
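
To make the "mini-DOM over a database" idea concrete, here is a minimal sketch of what such a layer might look like. Everything in it is an assumption for illustration: the `node` table layout, the `LazyNode` class and the sample data are hypothetical, and SQLite (from the Python standard library) stands in for MySQL so that the example is self-contained. It is not a description of any real persistent DOM product.

```python
import sqlite3

# Hypothetical schema: one row per element, linked to its parent.
SCHEMA = """
CREATE TABLE node (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES node(id),
    position  INTEGER NOT NULL,
    tag       TEXT NOT NULL,
    text      TEXT
);
CREATE INDEX node_parent ON node(parent_id, position);
"""


class LazyNode:
    """An element whose children are fetched from the database on demand."""

    def __init__(self, conn, node_id, tag, text):
        self.conn = conn
        self.node_id = node_id
        self.tag = tag
        self.text = text

    def children(self):
        # Only the direct children are read; the rest of the tree stays on disk.
        rows = self.conn.execute(
            "SELECT id, tag, text FROM node "
            "WHERE parent_id = ? ORDER BY position",
            (self.node_id,),
        )
        return [LazyNode(self.conn, *row) for row in rows]


def root(conn):
    """Return the single node that has no parent."""
    row = conn.execute(
        "SELECT id, tag, text FROM node WHERE parent_id IS NULL"
    ).fetchone()
    return LazyNode(conn, *row)


def demo():
    """Build a three-node tree and list the root's children."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO node (id, parent_id, position, tag, text) "
        "VALUES (?, ?, ?, ?, ?)",
        [
            (1, None, 0, "orders", None),
            (2, 1, 0, "order", "first"),
            (3, 1, 1, "order", "second"),
        ],
    )
    return [(child.tag, child.text) for child in root(conn).children()]
```

Each call to `children()` costs a query, which is the trade-off in question: memory use stays small, but tree navigation is only as fast as the database underneath it.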

So reflecting Tony's advice, use SAX, and if SAX isn't cutting it, then
maybe consider reading into an RDBMS and using SQL =)
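
As a rough illustration of that route, the sketch below streams a document through SAX and inserts each record into a relational table, so that cross-record questions are answered with SQL rather than by walking a tree. The `<item id="..."><price>...</price></item>` format, the `item` table and the `ItemLoader` class are assumptions made up for the example; SQLite is used only because it ships with Python.

```python
import io
import sqlite3
import xml.sax


class ItemLoader(xml.sax.ContentHandler):
    """Streams <item id="..."><price>...</price></item> records into a table."""

    def __init__(self, conn):
        super().__init__()
        self.conn = conn
        self.item_id = None
        self.in_price = False
        self.buf = []

    def startElement(self, name, attrs):
        if name == "item":
            self.item_id = attrs.get("id")
        elif name == "price":
            self.in_price = True
            self.buf = []

    def characters(self, content):
        # SAX may deliver text in several chunks, so collect them.
        if self.in_price:
            self.buf.append(content)

    def endElement(self, name):
        if name == "price":
            self.in_price = False
            # The record is complete: store it and keep nothing in memory.
            self.conn.execute(
                "INSERT INTO item (id, price) VALUES (?, ?)",
                (self.item_id, float("".join(self.buf))),
            )


def load(stream):
    """Parse an XML byte stream into a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id TEXT, price REAL)")
    xml.sax.parse(stream, ItemLoader(conn))
    conn.commit()
    return conn


def total_price(xml_bytes):
    """Load the document, then answer a whole-document question in SQL."""
    conn = load(io.BytesIO(xml_bytes))
    return conn.execute("SELECT SUM(price) FROM item").fetchone()[0]
```

The handler holds only the record currently being parsed, so memory use does not grow with the size of the document; for a really large file you would pass an open file and a disk-backed database instead of the in-memory ones used here.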

Oh and anybody with decent experience of persistent DOM, I'd appreciate the
feedback.

Cheers
    Guy.

----- Original Message -----
From: <Tony.Coates@r...>
To: <xml-dev@l...>
Sent: Wednesday, July 04, 2001 11:21 AM
Subject: Re: [Question] How to do incremental parsing?


>
> On 04/07/2001 01:27:28 "Xu, Mousheng  (SEA)" wrote:
>
> >A problem of all the current XML parsers is that they at least read the
> >whole XML document into the input stream, which can consume a lot of
> >memory when the XML is big (e.g. 1 GB).
>
> You will generally be told "use SAX not DOM" for large files/streams.
> That's OK if your application can deal with the data in your XML in a
> localised fashion.  And, it has to be said, designing your XML formats to
> work within the constraints of SAX can be a good exercise in avoiding
> structures that require backtracking through the document when they are
> processed.
>
> Still, it often is necessary to backtrack, or make connections between
> parts of a document that may be widely separated in the file/stream.  In
> this case, you want to be able to use something more like DOM, because the
> SAX alternative here would require you to build a store of the information
> that has been parsed, and that means (a) writing more code than you might
> like to, and (b) possibly storing as much information as a DOM tree would
> anyway.  What does seem to be a useful way forward for these kinds of
> problems are persistent DOMs built into databases, such as have been
> appearing recently.  The DOM tree is then paged into memory as required.  Of
> course, this is slower than holding the whole DOM tree in memory, but the
> fact is that databases are fast enough to do real stuff with, and if that is
> true for relational tables, it should hold true for persistent DOMs.
>
> So, "use SAX or a persistent DOM" for large XML files/streams is what I
> would suggest.
>
>      Cheers,
>           Tony.
[SNIP]

