Re: XML Database Decision Tree?

"Champion, Mike" wrote:

> First, perhaps I misunderstood, or perhaps we disagree on
> what "structured data" means. You (Ron) imply that it means
> something like "easily normalizeable," in which Kevin Williams'
> assertion is pretty much a tautology.

Agreed. Hence my point that the native XML database vendors already largely agree to most of the points he's making.

> I assumed it meant
> "well-understood, predictable structure", irrespective of
> how easy it would be to model for RDBMS storage. A bill of
> materials is "structured data" as far as I understood the
> term, but is not a poster child for the "RDBMS everywhere"
> campaign.

Also agreed. Lots of structured data fits RDBMSs better than bills of material, so it's a no-brainer to use RDBMSs for that data just like it's a no-brainer to use native XML databases for documents. It's the bills-of-material and similar structured data cases that make the line between where to use RDBMSs and native XML databases fuzzier than one might initially think.

> Anyway, the DocBook sneer was a cheap shot ... I
> admit it! ... but an article that starts "This column takes a
> look at so-called native XML databases" by a guy whose 1000
> page book called "Professional XML Databases" never (AFAIK from the index)
> even mentions the possibility of a "native" XML DBMS doesn't exactly
> inspire open-mindedness on my part <grin>.

Fair enough. My initial reply to DocBook was to say, "He's still wrong, but the reply is 'semi-structured data', not DocBook," until I noticed he stated at the close of the article that semi-structured data wasn't being considered either. It's too bad that the writing in some cases added an unnecessary bias to an article that otherwise made some good points.

> > Could you explain this further? The flexibility is easy to understand,
> > but I'm having trouble seeing what it buys me. If I store my data 17
> > different ways, I'm going to have one heck of a time querying it.
[explanation of why native XML databases are flexible snipped]

> So, the example of a collection of XML instances whose schema has
> evolved through 17 different iterations would not be a problem at all in
> some of the native XML DBMSs, would not be a problem in Tamino in most
> cases, and in the typical case all 17 subsets would be stored together in
> the "miscellaneous" collection where they can be queried with reasonable, if
> not optimal, efficiency. The only time you'd have to query 17 different ways
> is if the top-level tag changed with every iteration of the schema; so this
> is a worst case scenario involving a completely clueless set of developers
> and DBAs.

This is the part where I said you'd have to beat me over the head a bit :) If you keep changing the schema in really nasty ways, such as changing element type names, combining or splitting existing elements, or inserting new elements into the middle of the hierarchy, no amount of flexibility is going to make your queries any easier. They will have to cope with a multitude of incompatible changes.

I think that means there are only two allowable changes:

1) Adding new elements and attributes, with elements added to the end of existing content models
2) Deleting optional elements and attributes.

(Is this correct? Or are there more changes that you can make in a way that doesn't break existing queries?)

Since these are equivalent to adding/deleting nullable columns, what is the technical advantage afforded by the native XML database? I just don't see it.

> My larger point (as a couple of people have picked up on) was that most DB
> developers do not control their own destiny with respect to schemas -- they
> store the data someone tells them to store, and if the schema changes, they
> deal with it. Native XML DBMS developers will spend a lot fewer nights and
> weekends "dealing with it" than those trying to store evolving XML schema in
> an RDBMS.
I'm also having a hard time buying this argument. All you're saying is that there is no bureaucracy (yet) in place to control the contents of the native XML database. This is similar to what happened with spreadsheets when they first came out -- people got control of their own data and the DBAs lost out.

If this represents a true political change on the part of corporations, fair and good. Native XML databases can take credit as the agents of change, although it's not clear that their technology had as much to do with it as their newness, which made it possible to slip in under the radar.

But what I'm not convinced of is that, should native XML databases succeed, the bureaucracy controlling what goes into the database won't try to take them over. Granted, the schema-less nature of native XML databases will make that job harder, but it won't stop them from trying.

> I don't think this lack of flexibility in the logical-physical mapping is
> "disastrous."

That wasn't the "disastrous" I was referring to. The "disastrous" I meant was when one group starts storing data in a way that is very difficult for another group to use. Somebody has to mediate this process, whether it's DBAs, developers, or your Aunt Sarah's Ouija board. Failing to do so will cause problems down the road. Doing so will reduce the flexibility of native XML databases, although for political reasons, not technical reasons.

> <marketing hat="on">Tamino is evolving to be more than an XML DBMS engine,
> it is more like what IDC calls a "virtual database server" that can present
> "native" XML representations/manipulations of diverse data in an XML store,
> RDBMSs, transactional databases such as Adabas, middleware interfaces to
> enterprise applications, etc. So I have no axe to grind here about the
> intrinsic value of one physical storage mechanism vs another.</marketing>

I think this is a very good point. Native XML databases look like a good way to integrate data from a variety of backends, and I think the winning native XML databases of the future will do this transparently and bi-directionally. (Some are already doing it today.) Relational databases may have problems in this area, since the result of integrating data from a variety of sources is likely to be semi-structured data, not structured data.

-- Ron
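
The schema-evolution point made earlier in this message (additive changes keep existing queries working, renames do not) can be sketched in a few lines. This is only a minimal illustration and not something from the thread: the `order`/`item` documents and the `item_names` helper are hypothetical, and Python's standard `xml.etree.ElementTree` merely stands in for whatever query engine a native XML database would provide.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents representing three iterations of one schema.
V1 = "<order><item><name>bolt</name></item></order>"
# Additive change: an optional element appended to the end of the content model.
V2 = "<order><item><name>nut</name><weight>3</weight></item></order>"
# Incompatible change: element type 'name' renamed to 'title'.
V3 = "<order><item><title>washer</title></item></order>"


def item_names(xml_text):
    """Run the 'existing query' (the path item/name) against one document."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall("./item/name")]


# The same query is applied unchanged to every schema iteration.
results = {label: item_names(doc) for label, doc in (("v1", V1), ("v2", V2), ("v3", V3))}
print(results)  # {'v1': ['bolt'], 'v2': ['nut'], 'v3': []}
```

The query still finds its data after the additive change in `V2`, but silently returns nothing for `V3`, where storing the document posed no problem and only the query broke.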