XML-DEV Mailing List Archive
XML: logical and/or physical model?
I had a somewhat different reaction than most people to the logical models thread (and to Date's keynote speech, http://searchdatabase.techtarget.com/originalContent/0,289142,sid13_gci962948,00.html, that inspired it).

One major selling point of the relational model is that it separates the logical model of data from the physical implementation of a DBMS. Date alludes to this in his rant against the frequent OODBMS / XMLDBMS analogy about disassembling a car when you come home at night and reassembling it in the morning:

'Anyone who uses that analogy, Date said, displays a "lack of understanding of the difference between the logical and physical model." The use of the terms "flat tables" or "2D tables" to describe data stored in a relational database is wrong, he added.'

As I understand it, it is the job of the RDBMS implementation to perform whatever mapping from the logical to the physical world is needed to do this efficiently. [1]

Perhaps one advantage of XML is that it just blows off this distinction -- it gets a lot of its practical power by 'modeling' relationships as *physical* containment of a set of elements (which of course may be subtrees) inside other elements. As a logical model, this suffers from all the limitations that Codd exposed in the 1970s, but as a pragmatic way of handling text and data that tends to be ordered and hierarchical, it has a lot going for it:

- It is generally going to be easier to implement efficiently in read-only or dataflow/pipeline processing applications, where referential integrity is not an issue because the XML document itself defines the relevant context.
- It scales / parallelizes well if all the information needed to perform some business process is carried around in a discrete chunk that only requires access to a transactional DBMS at the beginning and end of the process. (Likewise, it relatively easily supports optimistic or compensation-based transaction processing.)
- It maps fairly directly to business-level documents (e.g. orders and invoices, which can be directly represented as XML documents but generally normalize to a significant number of tables), thus facilitating communication between the developers and users of software. (In the best case, the "business" view of the document *is* the XML with a stylesheet applied.)
- It greatly reduces the need for DBAs, part of whose job is to maintain the logical-physical mapping.

In my not-so-objective opinion, XML's success mirrors the ongoing success of post-relational DBMSs such as Adabas that adapt the storage model to the physical data structure of the application rather than asking the application to adapt to the logical model of the DBMS. [Yes, I assert that Adabas, invented in 1969, is a POST-relational DBMS -- very visionary!] Clearly there are downsides to this (as the relational proponents have pointed out for decades), but there are also distinct advantages in terms of performance, robustness, etc. in a lot of situations where the relational model's intrinsic advantages are not relevant.

So, my question is whether this characterization of XML as an essentially physical model rather than a logical one makes sense. Of course, the Infoset and XQuery treat XML as a logical model that is independent of the "physical" serialization or DBMS implementation, so what I'm talking about is more of a design pattern for using XML than an intrinsic property of XML itself, and it is independent of whether one thinks of XML as a labeled tree data structure or as Unicode text with angle brackets.

At any rate, the relational model doesn't need to be defended from XML; it lives in a different plane of reality. Date and others really seem to be making a *political* objection: people have stopped putting pressure on the DBMS vendors to support the pure relational model in a way that is efficient, reliable, and easy to use for messy real-world data.
Customers and RDBMS vendors have started using XML to address the set of problems that it handles relatively easily but that are still at the bleeding edge of relational technology, e.g. where order and hierarchy are critical and relationships are easily modeled via containment.

[1] Ken North mentioned D.L. Childs's STDS work that influenced the relational model. Childs is a neighbor of mine, and has a rather interesting metaphor for this: the logical model is up in the world of sunshine and light where the Eloi dwell with little concern for ugly realities; the physical model is the one that lives down in the land of the Morlocks who do the dirty work. It's nice to be oblivious to physical reality, but this sometimes leads to a really unpleasant realization when you find out where you really sit in the food chain :-)

BTW, I have a bunch of Childs's recent stuff archived at http://xsp.xegesis.org if anyone is interested in seeing where the STDS thinking has gone in the 35 or so years since Codd cited it. I am intrigued because he proposes a way to formally unite the relational model and XML's implicit data model using an extended set theory that makes order and hierarchy first-class citizens. In STDS (which I used via an early RDBMS called Micro at the University of Michigan back when dinosaurs roamed the earth), there is a formal relationship between the logical and physical models, which allows query optimization to be driven down almost to the hardware level.
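
To make the containment point from the list of advantages concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the order document, its element and attribute names, and the normalize() helper are hypothetical and do not come from the thread or from any product. It shows one way a document that models "order has items" purely by containment could be flattened into two relational-style row sets, where the relationship has to be restated as a key and the document order as an explicit column.

```python
import xml.etree.ElementTree as ET

# Hypothetical business document: the line items are physically
# contained in the order, and their sequence is simply document order.
ORDER_XML = """
<order id="1001" customer="ACME">
  <item sku="A-1" qty="2"/>
  <item sku="B-7" qty="1"/>
</order>
""".strip()


def normalize(order_xml):
    """Flatten one order document into two relational-style row sets.

    Containment becomes a foreign key (order_id), and the ordering that
    XML carries implicitly becomes an explicit position column.
    """
    order = ET.fromstring(order_xml)
    order_id = int(order.get("id"))
    orders = [(order_id, order.get("customer"))]
    order_items = [
        (order_id, position, item.get("sku"), int(item.get("qty")))
        for position, item in enumerate(order.findall("item"), start=1)
    ]
    return orders, order_items


orders, order_items = normalize(ORDER_XML)
print(orders)       # rows for a hypothetical "orders" table
print(order_items)  # rows for a hypothetical "order_items" table
```

Even in this toy case, one document turns into two row sets plus a key and a position column. A realistic invoice, with addresses, taxes, and nested line details, would fan out into considerably more tables, which is the trade-off the business-document bullet describes.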