RE: XML Database Decision Tree?
> -----Original Message-----
> From: Nicolas LEHUEN [mailto:nicolas.lehuen@u...]
> Sent: Wednesday, October 24, 2001 9:10 AM
> To: 'Champion, Mike'; 'xml-dev@l...'
> Subject: RE: XML Database Decision Tree?
>
> So I believe there is a whole set of problems that will
> benefit from XML databases (which are I believe based on the hierarchical
> database model*, maybe Mike can confirm/infirm). The storage, indexation and
> querying of a set of document-oriented data is a good example.
>
> But XML databases isn't or (won't) be a revolution, blasting all other
> storage models. We could even say that the XML database model
> is just a come back of the hierarchical model that was supposedly
> "killed" by the relational model back in the 80s.

First, NO ONE that I know of has said that XML DBMS will be a "revolution" that will blast all other storage models. The strongest assertion I know of is that there is a large class of problems that ordinary humans and businesses can address more expediently with XML DBMS than with RDBMS. I think the debate is mostly over what that class of problems is and how big it is.

I'm not sure if XML DBMS are "based on the hierarchical database model" in any formal sense, but there is clearly some overlap and inspiration. I distinctly remember a flash of neuronal activity when I first heard of Tamino a few years ago: "Oh yeah, Adabas handles hierarchical data rather than forcing it to be normalized, XML is hierarchical, I bet they have some deep insight on how to build an efficient XML DBMS!" When I joined Software AG some months later, I learned that it's not quite that simple code-wise, but obviously there is much of Adabas' intellectual heritage in Tamino.

Your [Nicolas] previous post was in the thread inspired by Fabian Pascal's rants against XML in his column in searchdatabase.techtarget.com and his various comments on the intelligence of several of us in his dbdebunk.com site.
I posted a summary of that thread to the "DBA Water Cooler" forum at searchdatabase.techtarget.com but got no substantive response (other than additional reminders about my lack of intellectual acuity, of course!).

I think there's a very interesting question here: Codd *demolished* the CODASYL data model as a respectable intellectual activity, and clearly showed the formal superiority of the relational model. Nevertheless, as the statistics I posted earlier in this thread indicate, hierarchical and other "pre-relational" DBMS keep chugging along, running the world economy in the back offices of the Fortune 1000. Finally, XML DBMS are finding a niche that some analysts (FWIW) are projecting to be in the billions of dollars in a couple of years. If the hierarchical model is dead, why won't it stay quietly buried?

One answer is "my code works, my business runs, it ain't broke so I don't care if some professor sneers at it." OK, but that begs the question of why. If the relational model is universally superior, why has no one come along and done it better, running the old fogies out of business?

I've come up with two reasons, FWIW. One is that humans can intuitively envision and reason about hierarchical relationships more easily than they can perform logical "joins" between diverse facts. Thus, thinking about inter-related bits of information as "documents" (and using the analogy with paper) is just a whole lot easier than doing normalization. Those who disagree and say that people intuitively love tables (witness the explosion of the spreadsheet metaphor in the last 20 years) miss the point that the formal power of the relational model comes not from thinking in tables; it comes from relations (no duplicate rows!), unique keys, foreign key constraints, joins, and all sorts of other stuff that is utterly foreign to all but the RDBMS cognoscenti.

Anyway, I'll assert that the formal superiority of the relational model as an elegant way to model information doesn't overcome our evolved-in bias for thinking hierarchically (see Herbert Simon's "Architecture of Complexity" essay that I keep referencing). When the going gets tough, we stick with what evolution has taught us well rather than using what Codd has tried to beat into our heads.

Second, this also applies to computers in that a) programs are written by humans and b) we evolved to think hierarchically because it's a great optimization heuristic that works for computers too. For example, I very briefly experienced the .bomb "revolution" first hand, and had occasion to see the data on book, CD, software, etc. catalogs that a company called Muze supplies to the online stores. It is BEAUTIFULLY normalized; the raw data for something that would be one page in Amazon.com is shipped in about 20-25 tables, everything done completely by the book. I innocently asked, "So, Amazon is doing a 20-way join on a multi-terabyte database every time I bring up a page? Wow, I didn't know that RDBMS technology had come that far." The Boss (who had served time in the Real World) basically replied "MO-RON! They probably denormalize it to hell, and pre-build views that represent the way people actually look at the data." [For the record, I have NO IDEA how Amazon works so well, I'm just relaying folklore]. In any event, I get the strong impression that real world RDBMS-based applications are doing lots of essentially hierarchical data processing; XML DBMS simply treat this as an advertised feature rather than an optimization secret.

So, in principle the relational model presents the user with a clean, logical view of arbitrary complexity, and delegates the details of mapping that clean view to ugly reality to the programmers at Oracle/IBM/Microsoft and the DBAs at Amazon/etc.
There are plenty of cases -- especially where information is used in different views by multiple applications, or where no one can tell in advance what relationships will be of interest to end users -- where this works very well. There are plenty of other cases -- especially for "documents" where the whole point of the authoring process was to pre-define interesting relationships -- where it is horribly inefficient. And there are all sorts of cases in the middle that neither RDBMS nor XML DBMS know how to handle well, and for which we need better conceptual and software tools than we have today.
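
To make the catalog anecdote above a bit more concrete, here is a rough Python sketch of the two approaches: assembling a page from normalized tables with a join on every request, versus building the hierarchical view once and handing back a stored document. Everything in it is an assumption for illustration only: the three-table schema, the sample rows, and the names `build_document`, `documents` and `fetch_page` are all made up, and it says nothing about how Muze, Amazon or any XML DBMS actually works.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified catalog: three tables instead of 20-25.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE album  (album_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE artist (artist_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE track  (album_id  INTEGER REFERENCES album(album_id),
                         position  INTEGER,
                         artist_id INTEGER REFERENCES artist(artist_id),
                         title     TEXT NOT NULL,
                         PRIMARY KEY (album_id, position));
    INSERT INTO album  VALUES (1, 'Example Album');
    INSERT INTO artist VALUES (10, 'Example Artist');
    INSERT INTO track  VALUES (1, 1, 10, 'First Song');
    INSERT INTO track  VALUES (1, 2, 10, 'Second Song');
""")


def build_document(album_id):
    """Join the normalized tables and return one album as an XML string."""
    rows = conn.execute(
        """SELECT album.title, track.position, track.title, artist.name
           FROM album
           JOIN track  ON track.album_id = album.album_id
           JOIN artist ON artist.artist_id = track.artist_id
           WHERE album.album_id = ?
           ORDER BY track.position""",
        (album_id,),
    ).fetchall()
    if not rows:
        return None
    album = ET.Element("album", title=rows[0][0])
    for _, position, title, artist in rows:
        track = ET.SubElement(album, "track",
                              position=str(position), artist=artist)
        track.text = title
    return ET.tostring(album, encoding="unicode")


# "Pre-build the view": pay for the join once, then store the hierarchy.
documents = {1: build_document(1)}


def fetch_page(album_id):
    """Serve a page from the pre-built document; no join at request time."""
    return documents.get(album_id)
```

The join in `build_document` is the "clean, logical view" side of the trade-off, and the `documents` dictionary stands in for the denormalized, pre-built view; whether that trade is worth making depends on which of the cases above your data falls into.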