RE: Storing Lots of Fiddly Bits (was Re: What is XML for?)
At 12:55 PM 1/30/99 -0500, Borden, Jonathan wrote:
> In general, object databases have been designed to efficiently store lots
> of c++ (or java) objects which contain embedded pointers (or references) and
> they provide a mechanism to navigate the database using the semantics of a
> pointer dereference. They are not designed to *efficiently* perform complex
> queries, especially those that SQL databases excel at.

If this is the definition of object database, then I don't think it qualifies as a "database" at all--it's just persistent object storage, which is useful, but not very interesting. At least my layman's idea of a "database" is that it is both general and supports queries.

Of course, this has always been one of my problems with object-oriented programming in general: it tends to cause people to conflate the data with the processing to the degree that the objects end up becoming primary, rather than things that serve the data. Persistent objects are useful as an optimization technique but they should never be a substitute for standards-based data repositories.

As a Certified SGML Paranoid Nutcase (CSPN) I distrust all software implicitly and therefore always prefer solutions in which the data, represented using SGML or XML, is the primary data store, with any other representations being merely transient reflections of that data for purposes of optimization. I accept that sometimes you are forced to trust your software not to screw up your data too badly.

Of course I realize that this extreme view can't work for some use scenarios, but it turns out to work really well for a lot of them, especially high-volume *publishing* scenarios, where the input to the publishing system is the SGML or XML. The cost of reserializing documents stored as objects at production time is orders of magnitude higher than the cost of objectizing them at indexing or editing time, largely because the throughput requirements are different for these different processes.
In other words, if the SGML data wasn't the primary format, it would be impossible to meet the production throughput requirements. For one particular customer, even the cost of not having the files directly on the file system is too high, so they have to go around behind the back of their storage manager (which provides access control and file-level versioning).

Or said another way: optimizing for one part of the process usually, if not always, deoptimizes for another part. Not news, but it bears repeating once in a while.

As an example of the cost of deserialization, we have a client with about 80 Meg of SGML data organized into about 15000 small documents (most documents are less than 2K in length). On a 400 MHz Pentium II with 128 Meg of memory (running Windows NT) and gigs of free disk space, it takes 21 hours to load this data into the repository (one of the leading SGML element manager databases, implemented on top of a leading object database) and 8-10 hours to export it. And, unless we're doing something wrong, the import process does not include indexing of the data, only objectizing it.

This seems a little extreme to me. It may be that this product is particularly poorly implemented or that we have failed to perform some essential tuning action, but still, 21 hours? I hope that this anecdotal evidence is not indicative of other, similar systems, but it's not very encouraging.

Cheers,

E.
--
<Address HyTime=bibloc>
W. Eliot Kimber, Senior Consulting SGML Engineer
ISOGEN International Corp.
2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004
www.isogen.com
</Address>

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
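
For readers who want the load and export figures in the message as rates, here is a back-of-the-envelope sketch. It uses only the numbers quoted in the message (about 80 Meg, about 15000 documents, 21 hours to load, 8-10 hours to export). The function name `throughput` is made up for illustration, and treating "Meg" as 2**20 bytes is an assumption, since the message does not say.

```python
# Rough throughput implied by the figures quoted in the message.
# Assumption: "Meg" means 2**20 bytes; the message does not specify.

def throughput(total_bytes, doc_count, hours):
    """Return average bytes/second and seconds/document for a run."""
    seconds = hours * 3600
    return {
        "bytes_per_second": total_bytes / seconds,
        "seconds_per_document": seconds / doc_count,
    }

TOTAL_BYTES = 80 * 2**20   # "about 80 Meg of SGML data"
DOC_COUNT = 15000          # "about 15000 small documents"

load = throughput(TOTAL_BYTES, DOC_COUNT, 21)         # 21-hour load
export_fast = throughput(TOTAL_BYTES, DOC_COUNT, 8)   # 8-hour export
export_slow = throughput(TOTAL_BYTES, DOC_COUNT, 10)  # 10-hour export

for label, result in [("load", load),
                      ("export (8 h)", export_fast),
                      ("export (10 h)", export_slow)]:
    print(f"{label}: {result['bytes_per_second']:.0f} bytes/s, "
          f"{result['seconds_per_document']:.2f} s/document")
```

Under these assumptions the load averages roughly five seconds per document, or a little over 1 KB per second, and the export averages roughly two to two and a half seconds per document.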