Some thoughts on 'direct access' to XML (long)
In the following, the term "BOXED XML" is used to describe the set of XML technologies exposed to the average programmer by their tools and the vendors of those tools, i.e. XML 1.x, namespaces, XSLT, W3C XML Schema and (soon) XQuery.

Suggestions:

1. The *only* thing that flows across a wire or between two apps should be unicode-with-pointy-brackets. If that is all that is flowing, and if BOXED XML does not pollute this, then everyone can be happy. If, in any given application, the unicode-with-pointy-brackets is serialized objects, fine! If it is process-independent semantic business data, fine. If it is some application-specific WordProcessorML, fine. Nobody is forcing anybody to use anything other than unicode-with-angle-brackets.

2. Object serializations are a very useful and common technique in programming. Object serialization can be accommodated on top of unicode-with-angle-brackets without discommoding anyone AS LONG AS BOXED XML does not shove the object-serialization world-view down everyone's throats. Those of us who do not wish to treat XML exclusively as a serialization notation for objects are concerned that adding strong data typing, yada, yada into BOXED XML basically encourages programmers to think of XML as a serialization technology. It isn't, wasn't and shouldn't be allowed to become a serialization technology.

3. Those of us who are against xml-as-serialization-notation-for-process-and-platform-specific-objects tend to worry most about the interoperability and coupling implications of so doing. Empirical evidence would suggest that you have to be "of a certain age" to think that this is vitally important. (It is.)

4. What is the easiest way to divide a stock price by revenue minus expenses?
Obviously it's something like this:

    stock = LoadStockFromXML("stock.xml")
    return stock.price / (stock.revenues - stock.expenses)

Any attempt at doing that in DOM/SAX/XSLT/XQuery is always going to come a poor second compared to the native language expression of the algorithm. This is a clear case where instantiating a Stock object from the XML is *exactly* the right thing to do. BUT this does not mean that BOXED XML must provide object serialization/de-serialization. If you take the shortest route to object serialization, you will just serialize/de-serialize your objects using whatever XML persistence tools your environment provides. If you do that you have, perhaps without realizing it, made your XML significantly less useful. Your XML has become process specific. Change your object structure (because of requirements changes or bugs) and all your serializations instantly turn into legacy. Interop with other systems is no better than it would have been with Java serialized objects, Python pickles, marshalled CORBA objects etc.

5. Of course it is a good idea that programmers should have tools to make it easy to read/write XML from their programming language of choice. However, IT DOES NOT FOLLOW that BOXED XML must be bent out of shape to provide strong datatyping, binary encoding and all the other stuff that typically accompanies notations for object serialization and transmission. Those of us who do not see XML as purely an object serialization technology worry that this sort of addition runs the risk of tilting the XML technology stack away from process-independent unicode-with-angle-brackets. The object world view is so IN YOUR FACE these days that the warning signs are there for all to see.

6. The right way to serialize/de-serialize objects is on top of, not inside, BOXED XML. And guess what guys? It's not rocket science. I do it every day of the week. Do I need BOXED XML to natively support that object-world view to do so? No.
Do I want BOXED XML to natively support that world-view? No! No, because I appreciate that the effort I take to divorce serialization/de-serialization from my unicode-with-angle-brackets is well worth it. It repays the effort over and over and over again as my systems evolve. I can evolve them without breaking them by applying standard Web techniques of proxying and interventionist intermediaries. They are a joy to monitor and debug. I would not sacrifice my unicode-on-the-wire for anything!

7. Programming languages can and should move past SAX/DOM for accessing XML. For pure document processing, they both have their place, but for Objects and Records (as the terms are used in mainstream programming), they are sorely lacking. I believe it is entirely possible to make the programmer's life easy WITHOUT turning BOXED XML into a basket of object-serialization technologies. This needs to be done ON TOP OF BOXED XML. Otherwise, I fear that a complete split in the XML world is inevitable.

Don Box recently said that if we insist on looking at Web Services as an RPC technology we will have missed a glorious opportunity. No argument there. I would add that if we insist on viewing XML as an object serialization technology, we will also have lost a glorious opportunity. Especially since the two are intimately related.

Sean
http://seanmcgrath.blogspot.com
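
Illustrative sketch (not part of the original post): one way to read points 4 and 6 together is an explicit, hand-written mapping that sits on top of plain XML and pulls out only the fields the program needs. The Python below is a minimal sketch under stated assumptions. The sample document, its element names, the Stock class and the price_to_profit helper are all hypothetical, and load_stock_from_xml is just a stand-in for the LoadStockFromXML call in the pseudocode of point 4.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical document: the element names are assumptions for illustration.
STOCK_XML = """<stock symbol="ACME">
  <price>42.00</price>
  <revenues>1500</revenues>
  <expenses>1200</expenses>
  <note>Elements the program does not ask for are simply ignored.</note>
</stock>"""


@dataclass
class Stock:
    price: float
    revenues: float
    expenses: float


def load_stock_from_xml(text: str) -> Stock:
    """Build a Stock from XML text through an explicit mapping layer."""
    root = ET.fromstring(text)

    def number(tag: str) -> float:
        # Look up one child element by name and convert its text to a float.
        node = root.find(tag)
        if node is None or node.text is None:
            raise ValueError(f"missing <{tag}> element")
        return float(node.text)

    return Stock(number("price"), number("revenues"), number("expenses"))


def price_to_profit(stock: Stock) -> float:
    """The calculation from point 4, written in the native language."""
    return stock.price / (stock.revenues - stock.expenses)


stock = load_stock_from_xml(STOCK_XML)
ratio = price_to_profit(stock)
print(ratio)
```

Because the mapping is written by hand rather than generated from the class, the Stock class can be restructured without changing the XML, and the XML can gain elements (such as the note above) without breaking the program. The cost is the extra mapping code, which is the effort point 6 argues is worth paying.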