XML CMM ISO9000 compliance? - was A standard approach to glueing togethe
You have identified one set of issues that concerns me regarding the use of XML for data systems, and I agree with your view as to how difficult such systems are to implement, maintain and support.

But your conclusion that "...but I have to assume that (pulling numbers out of the air) a 3-way Join of hierarchical document collections will be more practical than 100-way joins across normalized relations containing the components of complex documents such as aircraft maintenance manuals...." causes me some concerns. Specifically:

- it works the other way, i.e. a 3-way outer join on normalized data is more effective than 1 join for every element in every hierarchical document, where you might have several elements in hundreds or thousands of document files.

- assuming the 3-way join on XML docs is the same question as the 100-way join (and how can that be the case if the relational data is well designed, or could be mapped to the XML docs?), the 100-way join can use optimization facilities that exist in database products such as Oracle but do not exist for XML docs.

- components of complex documents exhibit increasing complexity over time, i.e. it is not a static system but a dynamic one. So while the 100-way join will always be a 100-way join, the 3-way join is highly likely to become a 300,000-way join over time, or an exponential growth in complexity over time for the non-normalized non-relational forms.

The fatal assumption seems to me to be inherent in the perception of a document as a printed page, a static physical object that does not change. Once it is automated, as a relational data system or an XML document, this assumption no longer holds true. Notations are added, links are added to other documents, external references link into, or through, specific areas or context references in the document, and so on and so on.

While XML, as a child entity of SGML, might be well suited to static document markup, I just cannot see how it is well suited to dynamic document automation.

Next, I expect to hear folks say that it is not _meant_ or _intended_ to be well suited to dynamic document automation, to which my reply is au contraire... if you automate with XML, that is precisely the premise you are relying on: that XML is a best-practice approach to dynamic document automation.

Unlike a printed page, an automated document, like any other automated data system, is dynamic and subject to change driven by external requirements that are by definition in flux. Assuming that a static state anywhere in the automated document process is acceptable is not valid IMHO.

Sure, you might be able to make it work today. Or even tomorrow. But working for 20 years, or longer, is not likely to be viable, because the maintenance and additional work requirements are likely to change in as yet unknown ways, driving costs that can be shown to be at least linear and more likely exponentially increasing over time.

That kind of outcome is precisely what TQM, then PE (process engineering), and now ISO 9000 and CMM have tried to avoid. That kind of outcome is not uncommon among software or automation projects, historically and, sadly, at present. That kind of outcome, a chaotic result, is typical of development processes that do not employ scientific methods, or use proofs and hard tests where results are measurable, reproducible, and predictable.

Now, of course the exception occurs now and then: someone will reach into a haystack of needles and pluck out precisely the needle needed. But that is always within a limited scope, or known universe, and is much more likely when the requirements are less rigorous and the lifecycle is shorter than the norm.

So, ok, that's my take on it. Problems arise most often from the assumptions we do not realize we are making, or have not examined in proper course.
Ergo the gauzy or foggy feeling one gets from CMM: the point is to identify problems before they are problems, and cure them long before they exhibit negative effects.

Thanks for your response.

At 08:30 AM 8/20/2003 -0700, Mike Champion wrote:
>--- Rick Marshall <rjm@z...> wrote:
>
> > <customer>
> >   <name>COMPANY X</name>
> >   <town>SOMEWHERE</town>
> >   <order>
> >     <part>ABC123</part>
> >     <quantity>2</quantity>
> >   </order>
> >   <order>
> >     <part>ABC234</part>
> >     <quantity>4</quantity>
> >   </order>
> > </customer>
> >
> > just isn't going to be a relational form as there's
> > no way to determine
> > a priori what the normalised records are....
>
> > so without some semantics you can't represent
> > relational tables with the
> > natural tree structure of xml.
>
>Yup. The hierarchical approach that XML supports
>allows you to not worry about the sometimes
>challenging problem of figuring out what the keys
>would be in a normalization that will allow you to get
>back the information you put in. It's sortof like the
>fox and hedgehog: the relational model has a many
>tricks for defining relationships among components,
>but you have to be clever to use it well; XML has only
>one trick ("containment") but it's a pretty powerful
>one. Of course, not all data fit the "natural tree
>structure of XML" but a lot of interesting examples
>do.
>
>The downside, which I think is the point of this
>thread (I haven't read the whole thing!) is that XML's
>"one big trick" works best if the document as a whole
>is the unit of analysis and storage. Once you start
>composing compound documents out of individual
>entities or need to update specific
>elements/attributes inside an entity, things start to
>get very ugly and there's little in the way of a
>theoretical model such as Codd developed to guide you.
>For example, there is a more or less irresolveable
>muddle between the XML syntax level model of entity
>declarations and references and the
>Infoset/XPath/XQuery model in which these are assumed
>to have been resolved. (DOM tries to play on both
>sides of the street, but that part of its conceptual
>model is very ugly).
>
>XQuery is probably a great breakthrough here by
>allowing both the implicit containment relationships
>that the relational model lacks and allowing documents
>to be composed by a Join operation on shared values,
>which AFAIK is the most profoundly powerful aspect of
>the RM. Whether XQuery implementations can be written
>in a way so as to make this practical for
>terabyte-scale databases is yet to be seen ... but I
>have to assume that (pulling numbers out of the air) a
>3-way Join of hierarchical document collections will
>be more practical than 100-way joins across normalized
>relations containing the components of complex
>documents such as aircraft maintenance manuals.
>
>
>-----------------------------------------------------------------
>The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
>initiative of OASIS <http://www.oasis-open.org>
>
>The list archives are at http://lists.xml.org/archives/xml-dev/
>
>To subscribe or unsubscribe from this list use the subscription
>manager: <http://lists.xml.org/ob/adm.pl>
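
For readers following the customer/order example quoted above, here is a minimal sketch of one way that document could be mapped onto normalized relations and read back with a single join. It is an illustration under stated assumptions, not something anyone in the thread proposed: the table names (`customer`, `customer_order`), the surrogate integer keys, and the helper names (`make_db`, `shred`, `orders_for`) are all hypothetical, and the surrogate keys are exactly the extra "semantics" that the quoted text says the XML alone does not supply. It uses only Python's standard `sqlite3` and `xml.etree.ElementTree` modules.

```python
import sqlite3
import xml.etree.ElementTree as ET

# The document from the quoted message, unchanged apart from whitespace.
DOC = """
<customer>
  <name>COMPANY X</name>
  <town>SOMEWHERE</town>
  <order><part>ABC123</part><quantity>2</quantity></order>
  <order><part>ABC234</part><quantity>4</quantity></order>
</customer>
"""


def make_db():
    """Create an in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (
            id INTEGER PRIMARY KEY,   -- surrogate key: an assumption
            name TEXT,
            town TEXT
        );
        CREATE TABLE customer_order (
            id INTEGER PRIMARY KEY,   -- surrogate key: an assumption
            customer_id INTEGER REFERENCES customer(id),
            part TEXT,
            quantity INTEGER
        );
        """
    )
    return conn


def shred(xml_text, conn):
    """Split one <customer> document into a parent row and child rows."""
    root = ET.fromstring(xml_text)
    cur = conn.cursor()
    cur.execute(
        "INSERT INTO customer (name, town) VALUES (?, ?)",
        (root.findtext("name"), root.findtext("town")),
    )
    customer_id = cur.lastrowid
    for order in root.findall("order"):
        cur.execute(
            "INSERT INTO customer_order (customer_id, part, quantity) "
            "VALUES (?, ?, ?)",
            (customer_id, order.findtext("part"), int(order.findtext("quantity"))),
        )
    conn.commit()
    return customer_id


def orders_for(conn, name):
    """Recover the containment relationship with one join on the key."""
    rows = conn.execute(
        "SELECT c.name, c.town, o.part, o.quantity "
        "FROM customer c JOIN customer_order o ON o.customer_id = c.id "
        "WHERE c.name = ? ORDER BY o.id",
        (name,),
    )
    return rows.fetchall()


conn = make_db()
shred(DOC, conn)
rows = orders_for(conn, "COMPANY X")
for row in rows:
    print(row)
```

The sketch says nothing about how either approach scales; it only shows that, once keys are chosen, the containment in the XML corresponds to a foreign-key join on the relational side.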