Re: A standard approach to glueing together reusable XML fragme
<Quote>
Unless someone can show me how XML or an XML only tool set such as TeraText supports and fulfills RM,
</Quote>

Are you asserting that one cannot represent relationally structured data using XML? If so, can you please elaborate?

Kind Regards,
Joe Chiusano
Booz | Allen | Hamilton

dbexcom wrote:
>
> At 05:44 PM 8/20/2003 +1000, you wrote:
>
> > On Tue, Aug 19, 2003 at 04:48:08PM -0400, dbexlist wrote:
> > > I like what I see in TeraText, from their web site, but none of the situations of which I am aware can afford to treat the data elements, or XML data items, as text only. Every one of these applications has cause to use relations between normal forms of the data elements, and to do advanced indexing on various data types not just text, such as dates and date ranges, numerical process results (averages, means, distributions, etc), scientific enumerations and so on.
> >
> > Just to clarify, as one of the TeraText developers I should note that the TeraText DBS can store and index data not just as SGML or XML or MARC data, but also as primitive types such as dates, durations, integers, floats, booleans, and Unicode/ASCII strings. These can be repeating, combined in user-definable, recursive structures, or can be used to populate dynamically calculated fields. So it's not just raw XML. :-)
>
> news to me, but good to hear.
>
> > > Gov't docs are often like that - they are heavily laden with text or prose, but also have significant valuations in other data types including math equations with all sorts of notation formats or other readings such as pollution indexes from the EPA, or farm crop estimates vs. harvests by crop by month by county by year, or rainfall vs. temperature over time for each day by GPS coordinate areas, etc. etc.
> >
> > Yes, absolutely. It's really common for applications to want to directly store lists of keywords, dates, durations, etc.
> > in a record, along with well-formed or valid XML.
> >
> > > In other words, the TeraText approach does not seem to support relations between normal forms, and so seems to have a self imposed design limit that I, personally, find short of desirable. It is not just about massive data handling, but also about being able to do things with that data after it has been captured and has existed for some time, things that support requirements that are not yet known. In my opinion, only normal forms and relational theory, or the relational model (RM), offer this capability.
> >
> > Yes; building chains from one piece of information to another can be invaluable, particularly with intelligence problems. To that end, the TeraText DBS has the ability to index specific relationships between records in different databases; a bit like pre-computed joins. For particular kinds of applications, this is often precisely what's needed. True, it's not the same as having a relational database, but if one has several 100GB of genuinely relational data one can always attempt to manage it with [a leading RDBMS]. :-)
>
> The situation presented to me was that a high growth (10% / yr or more), very large data store (terabytes of prose plus terabytes of data, plus streaming media) is _best_ implemented in pure XML or an XML only structure, even though the processes using this data require relations on normal forms, self-joins, inner-joins, outer-joins, full corpus searches of some complexity and versioning of documents. My response was that, maybe it could be done, but XML only was not the best way to quickly achieve low cost (both initial and maintenance / operational) and high reliability and high flexibility in off the shelf hardware (Sun servers at most).
> It does not seem to me that this size and scope of data can be managed in anything other than [a leading RDBMS], though perhaps it can be built in TeraText or another similar product line.
>
> The key word here being "managed". Massive data stores like this take on a life of their own in my experience, gain their own momentum and dynamics with an ever increasing list of dependent systems or processes. This makes them difficult to manage. I just don't see the tool set in TeraText that I see in, say, Oracle.
>
> For the sake of discussion I am willing to stipulate that TeraText, or [a leading XML only vendor] can do everything Oracle can do, though my experience is that this is emphatically _not_ the case. There are still serious concerns with an XML only approach. Specifically, my gut feeling is that a pure XML approach has a significant risk, or a certainty, of n-modifications being driven by y-permutations of z changes across static schemas and into XML docs (whether record oriented or data oriented). Meaning it seems to me that XML maintenance work will grow exponentially over time, while [a leading RDBMS] maintenance work remains linear or less than linear with respect to the baseline level of effort.
>
> It worries me to see PTO and other efforts proceeding without apparent consideration to the specific, well documented, and very difficult to resolve issues that drove the development of Relational Theory and the Relational Model (RM), way back when.... I agree with the position taken by others that if SQL adhered to and fully supported RM, SQL maintenance issues would be exponentially less than they are currently, and I have the same sentiments towards XML: if it fully supports RM then we can reasonably expect lower maintenance and support costs over time; if it does not support RM then we can reasonably expect escalating maintenance and support costs over time.
> Exponentially escalating costs are highly undesirable in my opinion.
>
> Unless someone can show me how XML or an XML only tool set such as TeraText supports and fulfills RM, my expectations regarding exponentially increasing maintenance work efforts will remain a serious concern for me. The issues that drove the development of RM have not gone away, and are very apparent to me in many, or all, of the XML discussions I read - though different language is used.
>
> One does not have to look far to see a plethora of examples, in business or the public sector, of high maintenance costs associated with state-of-the-art XML systems. Lots of data exists from published sources to support the concern that high maintenance costs are escalating at a non-linear rate for the vast majority of XML systems even though most of these systems are not XML only solutions.
>
> Theory and practice always differ, but I would like to see proofs that high maintenance costs, escalating over time, are not the normal evolutionary path for almost all XML systems.
>
> It is not the normal practice for budgets to allocate funds exceeding the original application cost, year after year, escalating over time, for maintenance work on existing applications, in my opinion. Nor is this the result expected by senior or high level management.
>
> In practice, in real world practical applications, well designed DBMS systems that approach RM require at most 1/100th of their original development costs in maintenance expenditures on an annual basis. If an XML approach cannot offer a better result at a lower cost over the lifetime of the application, then I submit that the only ethically and morally valid approach (that is to say the only Professional approach) in the context of private sector economics or public sector economics is [ a leading RDBMS vendor ] product.
>
> Regards,
>
> Larry
>
> > Regards,
> > Michael
> > ____________________________________________
> > http://www.mds.rmit.edu.au/~msf/
> > Multimedia Databases Group, RMIT, Australia.