Three Access Language Paradigms
I have been thinking intensely about several issues these past few days, and I've been trying to put them all together into a coherent whole. So far I'm not succeeding, so I'm initiating a series of discussions to help me make sense of things. Here's the first...

We would like clients to be able to remotely manage documents residing on servers. Clients need to be able to both query and edit those documents. This might be done via OMG CORBA interfaces, or it might be done via a human-readable query language. Whatever the mechanism, I'd like to call it a "document access language" or just an "access language" for purposes of this discussion. In this posting I explore three different access language paradigms.

It seems to me that so far the W3C has focused on using DOM as the language by which clients remotely access documents. Under DOM, clients view documents through CORBA interfaces that make the document look like a tree of XML objects. Once the W3C has established all of the necessary interfaces, a client will have full control over a document's contents, subject to DTD and access control constraints.

More recently, we have discussed possibly supplementing the DOM approach with a human-readable access language. A streamable access expression would be shipped to the server, and the server would provide a streamed response. Document content would have to transfer between client and server, and the form of the content would be constrained by the DTD that defines the document. The syntax of the human-readable language is undecided. It might be OQL, or it might be SDQL with extensions, or it might be XML with embedded content.

I'd like to present still another form of access language. This approach is based on a different way of thinking about documents. Instead of asking document repositories to look like XML documents to the external world, we only ask that the repositories speak XML with the external world.
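To make "speaking XML" a little more concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions: the element names (`get-record`, `response`, `record`), the `status` attribute, and the in-memory store are all invented and are not part of any proposed protocol. The point it tries to show is that the repository keeps its data in whatever form it likes (here a plain dictionary) and only the messages crossing the wire are XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical internal storage: plain Python dicts, not an XML tree.
_RECORDS = {
    "p17": {"name": "A. Smith", "ward": "3B"},
}

def handle_message(request_xml: str) -> str:
    """Answer one XML request message with one XML response message."""
    request = ET.fromstring(request_xml)
    response = ET.Element("response")
    if request.tag == "get-record":          # hypothetical request element
        record = _RECORDS.get(request.get("id"))
        if record is None:
            response.set("status", "not-found")
        else:
            response.set("status", "ok")
            rec_el = ET.SubElement(response, "record", id=request.get("id"))
            # The XML shape is produced on the way out; it is never stored.
            for field, value in record.items():
                ET.SubElement(rec_el, field).text = value
    else:
        response.set("status", "unsupported")
    return ET.tostring(response, encoding="unicode")

reply = handle_message('<get-record id="p17"/>')
```

A client would parse `reply` like any other XML text; it never learns that the data lived in a dictionary rather than in a document tree.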
DTDs would be defined for the protocols that repositories might care to speak. The DTDs would define the structure of the protocol messages rather than the structures of documents. One repository might speak several protocols (e.g. 'Patient Records Protocol V.152' or 'Bank Transaction Protocol 2A'). If the repository were capable of containing arbitrary XML documents, the repository might speak a specific protocol called 'XML Document Protocol V.1.0'.

Under the third approach, XML documents would appear less often as persistent repositories and more often as transient messages between clients and servers. It would still be necessary to define a base DTD for all of these protocols, since one server port must be able to parse them all well enough to identify the protocol. It may even be possible to define the syntax for queries, insertions, and updates, so that the individual protocols have less inventing to do.

Briefly consider the benefits of the third approach. The most significant benefit is that it completely frees the repository from having to conform to an XML object model. We could expose a legacy database to the world through one of the protocols with only a thin wrapper around the database. New databases could restrict the protocols they support and specialize their structures according to the kind of data they care to represent. They could be based on custom object-oriented schemas or relational schemas. This approach also lowers the entry level into the data repository server world. We could think of servers more as information warehouses than as virtual documents.

The most significant drawback of this approach is that it doesn't give us a single access language. It probably gives us a different access language for each protocol. (Somebody please let me know whether this need not be so.) One of those access languages would be defined in the 'XML Document Protocol,' and this is the language that we have been looking for so far.
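The remark above about a base DTD, where one server port parses every message just far enough to identify its protocol, could be sketched roughly as follows. This is a guess at one possible shape, not a design: the `message` envelope element, its `protocol` attribute, the protocol identifier strings, and the handler functions are all hypothetical names I made up for the example.

```python
import xml.etree.ElementTree as ET

def handle_patient_records(body):
    # Placeholder for a protocol-specific handler.
    return f"patient-records handler saw <{body.tag}>"

def handle_xml_document(body):
    # Placeholder for the generic 'XML Document Protocol' handler.
    return f"xml-document handler saw <{body.tag}>"

# Hypothetical protocol identifiers, loosely modelled on the examples above.
HANDLERS = {
    "patient-records/152": handle_patient_records,
    "xml-document/1.0": handle_xml_document,
}

def dispatch(message_xml):
    """Read only the shared envelope, then hand the body to one protocol."""
    envelope = ET.fromstring(message_xml)
    if envelope.tag != "message":
        raise ValueError("not a base-protocol envelope")
    protocol = envelope.get("protocol")
    handler = HANDLERS.get(protocol)
    if handler is None:
        raise ValueError(f"unsupported protocol: {protocol!r}")
    if len(envelope) != 1:
        raise ValueError("envelope must contain exactly one body element")
    # Everything inside the envelope is governed by the protocol's own DTD.
    return handler(envelope[0])
```

Only the envelope is common to all protocols in this sketch; a server that speaks just one protocol would register a single handler and reject everything else.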
Ideally, the access languages for all of the protocols would have the same syntactic substrate, so that the only new additions to each protocol would be elements that are specific to the information being represented. However, it is not immediately apparent to me that this will be possible.

Yet there are so many ways to represent data in XML and in other formats such as relational and persistent OO. The database vendor should not be constrained to use an architecture that will export the repository as something that looks like XML (such as DOM). For example, many different DTDs can be invented to represent a given set of data, and no standard should constrain a vendor to use a specific DTD for organizing the information. A standard should exist for how to query and update information and for how to represent the data of concern (e.g. patient records or transactions) -- that's what the DTDs should define. Hence, I came to the protocol proposal.

Now it's time to talk about SQL and OQL. To a large degree these languages expose the representation underlying the database. SQL exposes tables and columns, while OQL exposes the persistent classes and their methods. These access languages are defined based on the schemas, so that once the schemas are defined, voila, so are the access languages. We save ourselves a lot of time.

The SQL and OQL approach has one extremely significant drawback: compatible databases must have identical schemas. Where are the clients that speak 'Patient Record Schema V.2.1,' and where are all the databases that are compliant with this schema standard? Everybody uses generic database backends, and no little guys can come in to compete by specializing for a given standard. If we had based these older query languages on protocols, it wouldn't have been much of a problem for object-oriented vendor X to come in and replace relational vendor Y's server implementation of a standard; there would have been no need to replace the clients.
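The vendor-swap argument can be illustrated with a small sketch, again in Python and again under invented assumptions: the `get-patient` and `patient` message shapes, the table layout, and the class names are hypothetical. One backend is relational (using the standard `sqlite3` module), the other holds ordinary objects, and the same client code talks to either because it only knows the messages.

```python
import sqlite3
import xml.etree.ElementTree as ET

class RelationalBackend:
    """Stand-in for vendor Y: data lives in a table."""
    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE patient (id TEXT PRIMARY KEY, name TEXT)")
        self.db.execute("INSERT INTO patient VALUES (?, ?)", ("p17", "A. Smith"))

    def patient_name(self, patient_id):
        row = self.db.execute(
            "SELECT name FROM patient WHERE id = ?", (patient_id,)
        ).fetchone()
        return row[0] if row else None

class Patient:
    def __init__(self, name):
        self.name = name

class ObjectBackend:
    """Stand-in for vendor X: data lives in objects."""
    def __init__(self):
        self.patients = {"p17": Patient("A. Smith")}

    def patient_name(self, patient_id):
        patient = self.patients.get(patient_id)
        return patient.name if patient else None

def serve(backend, request_xml):
    """Protocol layer: identical for both backends."""
    request = ET.fromstring(request_xml)
    name = backend.patient_name(request.get("id"))
    response = ET.Element("patient", id=request.get("id"))
    if name is not None:
        ET.SubElement(response, "name").text = name
    return ET.tostring(response, encoding="unicode")

def client_lookup(server, patient_id):
    """The client knows the message shapes, never the schema."""
    request = ET.Element("get-patient", id=patient_id)
    reply = ET.fromstring(server(ET.tostring(request, encoding="unicode")))
    name_el = reply.find("name")
    return None if name_el is None else name_el.text
```

Calling `client_lookup(lambda msg: serve(RelationalBackend(), msg), "p17")` and the same call with `ObjectBackend()` go through exactly the same client code; with SQL or OQL as the access language, the client would instead have to know about the `patient` table or the `Patient` class.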
Shouldn't we be building that sort of flexibility into our new XML-compliant databases now, so that we will be able to accommodate tomorrow's unexpected architectures?

I do not believe that it is necessary for an access language to expose the database's architecture. In our case, I do not believe an access language must assume that the database is architected in a way that allows it to appear externally as an XML document. It might be desirable to do this, since it could keep us from having to extend the query language for each protocol, but I do not think that it is necessary. It is only necessary that the client and the server agree on the structure and the meanings of messages sent between them.

We ought not place constraints on our servers that need not be there. I think DTDs for persistent documents are going to be over-constraining.

I have more issues to discuss regarding DOM and the required nature of an XML-document query language. Everything seems related to everything else, but I'll end this topic here just to get things started.

--
Joe Lapp (Java Apps Developer/Consultant)
Unite for Java! - http://www.javalobby.org
jlapp@a...

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)