Re: XML tools and big documents

  • From: "Ingo Macherius" <macherius@d...>
  • To: xml-dev@i...
  • Date: Thu, 3 Sep 1998 02:46:49 +0200

David Megginson <david@m...> wrote at 1 Sep 98, 16:57:
> I do not need to build a tree for the whole document; instead [...] 
> dump it into my SQL database [...]. 
=> Put it into an RDBMS

> [...] it makes more sense to build the specialised object tree
> directly from the event stream rather than building a DOM tree
=> Put it into an OODBMS

"Michael Kay" <M.H.Kay@e...> wrote at Wed, 2 Sep 1998 10:31:41 +0100:
> [...] storing the Java serialization of DOM-like models on disk [...]
> takes a lot longer than reparsing original XML
=> Put it in a file and reparse

So when it gets big, use a database? Did I get this wrong, and XML
was never meant to be a storage paradigm?

Anyway, I can confirm Michael's results.
We implemented an experimental database storage for SGML with jjc's 
SP and Informix's IUS. It generalizes something similar to David's 
second suggestion. Object-aggregation is done by marking the content 
of specified element types (e.g. <act> in a Shakespeare play) to be 
stored unparsed. When it comes to queries it is reparsed on the fly. 
Kind of automatic object generation.
Queries turned out to become slow as the granularity gets finer.
Most navigations trigger child/sibling lookups, which trigger object
ID table lookups. That's at least one SQL statement firing for every
DOM navigation call. Caching helps, but doesn't really solve the
problem. Trees in an RDBMS are no fun. Michael writes they are no fun
in an OODBMS, either. IMHO the good timings of in-memory DOM
implementations result from the fact that looking up children is a
cheap operation. In current DB systems it's not cheap at all.
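
To make the "one SQL statement per navigation call" cost concrete, here
is a minimal sketch in Python with SQLite. It is not the SP/IUS setup
described above; the `node` table, its columns and the helper names are
assumptions made up for illustration. It walks a tiny tree using only a
DOM-style child lookup and counts the statements issued.

```python
import sqlite3

# Hypothetical schema: one row per element, linked to its parent by ID.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, "
    "pos INTEGER, name TEXT)"
)
rows = [
    (1, None, 0, "play"),
    (2, 1, 0, "act"),
    (3, 1, 1, "act"),
    (4, 2, 0, "scene"),
    (5, 2, 1, "scene"),
    (6, 3, 0, "scene"),
]
conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?)", rows)

statements = 0  # number of SELECTs issued by navigation calls


def children(node_id):
    """DOM-style child lookup: exactly one SQL statement per call."""
    global statements
    statements += 1
    cur = conn.execute(
        "SELECT id FROM node WHERE parent = ? ORDER BY pos", (node_id,)
    )
    return [row[0] for row in cur]


def count_descendants(node_id):
    """Walk the subtree using only children(), as a DOM client would."""
    total = 0
    for child in children(node_id):
        total += 1 + count_descendants(child)
    return total


descendants = count_descendants(1)
print(descendants, "descendants,", statements, "SQL statements")
```

Every node visited costs one statement, including the leaves, so the
statement count grows with the number of nodes touched rather than with
the number of queries the user actually asked.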

Is anybody aware of literature on efficient addressing in trees?
This should help both in-memory DOMs and DBs.
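
One addressing scheme that is often described for this kind of problem,
though not mentioned in this thread, is interval (nested set) numbering:
each node gets a start and an end number from a depth-first walk, and
ancestor/descendant tests become two integer comparisons instead of
repeated child lookups. The following is only a sketch; the tree
contents and function names are assumptions for illustration.

```python
def label(tree, root):
    """Assign each node a (start, end) pair by depth-first traversal."""
    labels = {}
    counter = 0

    def visit(node):
        nonlocal counter
        start = counter
        counter += 1
        for child in tree.get(node, []):
            visit(child)
        labels[node] = (start, counter)
        counter += 1

    visit(root)
    return labels


def is_descendant(labels, a, b):
    """True if node a lies strictly inside node b's interval."""
    return labels[b][0] < labels[a][0] and labels[a][1] < labels[b][1]


# Hypothetical document structure: parent -> ordered list of children.
tree = {
    "play": ["act1", "act2"],
    "act1": ["scene1", "scene2"],
    "act2": ["scene3"],
}
labels = label(tree, "play")
print(is_descendant(labels, "scene2", "act1"))
print(is_descendant(labels, "scene3", "act1"))
```

Stored as two indexed integer columns, such labels would let a database
fetch a whole subtree with one range condition. The price is that
inserting a node can force renumbering of many others.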

A bit disillusioned,
	++im

--
Ingo Macherius//Dolivostrasse 15//D-64293 Darmstadt//+49-6151-869-882
GMD-IPSI German National Research Center for Information Technology
mailto:macherius@g... http://www.darmstadt.gmd.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Zappa)

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)

