Re: Storing Lots of Fiddly Bits (was Re: What is XML for?)
On Thu, Apr 08, 1999 at 12:17:24PM -0500, Paul Prescod wrote:
> Marcelo Cantos wrote:
> >
> > ... [best of both worlds] ... You get a nice object oriented
> > layer on top to talk to, and an industrial strength, robust
> > repository underneath.
> >
> > Your comments give me the impression that this is unacceptable to
> > you in the XML/hierarchical universe. You don't want DOM at any
> > level. You insist on going straight to objects. It is not even
> > good enough to build an object layer on top of the DOM layer. I
> > find this a little implausible and hence am certain that you had
> > something else in mind. Is it rather that you simply don't care
> > what the underlying API is, that you are only interested in what
> > happens at the object level?
>
> If I had evidence that a bottom-level XML/"DOM" layer would "buy me"
> an industrial strength, robust repository then I would go for it. As
> you have pointed out, I can cover up the ugliness with objects. But
> to me, an industrial strength, robust repository implies
> sophisticated tree-smart *and* link-smart ad hoc query support. The
> DOM isn't a query language and doesn't (AFAIK) have a query
> interface. It might be okay as an API to the results of a query but
> even there I'm leery...

I agree with all this. If you're dealing with objects, go with OODB.

I think, however, that the situation is far less clear when we are
dealing with pure data structures as opposed to first-class objects
with behaviour. When it comes to maintaining and querying a large
database of _data_ (not objects), I believe a text retrieval engine
will generally outperform an object database, often by several
orders of magnitude (witness Eliot Kimber's anecdotal post). If
scalability and performance are an issue (and, judging by recent
discussions, they often are) then text retrieval technology becomes
much more attractive.

Object databases excel in the area of expressiveness, which enables
them to support much more complex queries than we can. At present,
our product (SIM) doesn't support ad hoc queries. It is more like a
relational database in that you define fields, which can be physical
fields or calculated fields (this means we support arbitrarily
complex structure, but have to decide in advance which set of queries
to support, a compromise that has kept our customers happy so far).
We are, however, looking at full structure queries in the near
future.

So while the IR community is closing the gap in the area of
expressiveness, I wonder if the Object community can catch up in the
area of performance (or maybe it's already there and I just don't
know it).

> Since trees can be built as a special case of links, I tend to look
> for such a beast to come out of the OO world (where links are
> usually primary) instead of the text processing world (where the
> tree is usually primary). Maybe you guys at rmit.edu can surprise
> me though.

We certainly hope so. Our customers constantly praise the
performance of SIM. However, we definitely see a strong need to beef
our product up in the standards area. We are looking into support
for XQL and DOM, and we have the framework to incorporate both
without too much effort. In fact DOM is almost in, since it is quite
similar to our existing model. XQL is somewhat more effort, but the
path indexing required to support multi-gigabyte queries would
require little effort--the hard part is query evaluation and, more
importantly, optimisation.

> But note that a DOM-on-the-bottom is the opposite of the
> architecture that I am speaking out against. I'm concerned about
> people who want to layer the DOM on "top" of things that do not look
> substantially like XML. In that case you are covering up an
> optimized, purpose-built abstraction with a homogenized "dumb tree"
> layer. That's a step backwards.
> Note that even the DOM creators do
> not view an XML-DOM as a "universal tree API." That's why there are
> several variants of the DOM -- for XML, HTML, CSS etc.

I must conclude from this that we have little to disagree about in
terms of the uses for DOM. I had misunderstood you to mean that DOM
is _never_ appropriate for the bottom layer, and I, coming from the
document repository universe, would have disagreed.

Having said that, however, we tend to view DOM more as a box ticking
exercise, since it doesn't really give SIM anything it doesn't
already have, albeit in a non-standard way.

My views on Object databases are ambivalent. Their highly expressive
nature seems unfortunately coupled with poor performance. However,
my opinion may be skewed by the very few attempts I've personally
seen at piggy-backing a text retrieval engine on an Object database
(or, for that matter, on a relational).

Cheers,
Marcelo Cantos
--
http://www.simdb.com/~marcelo/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)