PSVI formalization
Recent discussions here about XQuery, XPath 2.0, and their knotted relationships with W3C XML Schema have made me think more deeply about the relationship between XML and W3C XML Schema, particularly the Post-Schema Validation Infoset (PSVI). There were a bunch of presentations last year about how XML + XSD -> XML 2.0, something I found merely annoying then but which makes more sense now.

The community that craves these features is poorly served in many ways by XML 1.0, with its text orientation, structures that can be loose to the edge of complete unpredictability, and a human-readability requirement that makes documents incredibly verbose but is useful in many cases only during debugging. XML 1.0 is now more and more buried under layers of other processing, and the common foundation for W3C work moving forward appears to be the PSVI - or at least an enormous amount of effort is going into integrating the PSVI with a large number of projects, and it seems that most of the vendor and programmer excitement these days is focused on the PSVI, not the brutish markup that lurks underneath.

The PSVI seems to be what programmers and database folks want. It offers strongly typed and highly structured information, already guaranteed to conform to their expectations. It has the same flexible named hierarchies that XML offers, with none of the messy concerns about character encodings, CDATA sections, or the limitations of text for storing binary information.

At the same time, the PSVI is pretty difficult to express in XML. Layers of type information can make it complex to pin down how best to describe a particular piece of information. Object-oriented development manages that every day, but doesn't have to express the whole hierarchy for every piece of information in a flat representation. Given recent discussions of synthetic PSVIs, it's not always clear that XML+schema->PSVI.

I conclude from all of this that XML is not a good foundation for the kinds of information developers want from the PSVI, and that retrofitting XML to carry that information is perhaps the root cause of the complexity explosion we're seeing in W3C XML Schema and the specifications which build on it. It seems to me that it might be wiser to use the PSVI directly for more abstract information modeling rather than expecting XML representations to carry the load.

So where does this take us?

Developers who want to work with the PSVI should work with the PSVI, and not worry about XML. The kind of interoperability the PSVI is designed to provide is very different from the kind of interoperability that XML provides - a perfectly reasonable conclusion given the different situations leading to the creation of their respective specifications.

Beyond that, it seems like some easily-exchanged representation of the PSVI is in order. XML works, sort of, but it seems pretty obvious that there are better approaches to representing information if you have all the information the PSVI provides rather than a simple "all is text" approach. This could easily be a binary format, though text might also be an option.

XML has done a wonderful job of convincing the world that it is possible to agree on base formats for some kinds of information, and that generic tools (parsers, editors, etc.) can be useful for a wide variety of specific problems. It seems reasonable to suggest that the lesson of XML is not "everyone must use angle brackets and text" but rather that "shared information formats are really useful when supported by a reasonable set of tools". Given the immense bias in current XML work at the W3C toward support for the PSVI, it seems like it might well be time to find an appropriate means of expression for the PSVI.

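
As a rough illustration of the contrast above between an "all is text" representation and one that carries type information, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `SCHEMA` table is a hypothetical stand-in for a real W3C XML Schema, the element names are invented, and the packed layout is not any proposed PSVI format.

```python
import struct
from xml.etree import ElementTree as ET

# Hypothetical stand-in for a schema: element name -> (Python type, struct code).
SCHEMA = {"quantity": (int, "i"), "price": (float, "d")}

def text_to_typed(xml_text):
    """XML -> typed values: the schema is needed to interpret each string."""
    root = ET.fromstring(xml_text)
    return {child.tag: SCHEMA[child.tag][0](child.text) for child in root}

def typed_to_text(values, root_name="item"):
    """Typed values -> XML: no schema needed, every value becomes text."""
    root = ET.Element(root_name)
    for name, value in values.items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")

def typed_to_binary(values):
    """Pack values in schema order; names and types are implied by the schema."""
    fmt = "<" + "".join(code for _, code in SCHEMA.values())
    return struct.pack(fmt, *(values[name] for name in SCHEMA))

xml_text = "<item><quantity>12</quantity><price>19.5</price></item>"
typed = text_to_typed(xml_text)      # typed values recovered via the schema
binary = typed_to_binary(typed)      # compact form, readable only with the schema
print(len(xml_text), len(binary))
```

The packed form is much smaller than the markup, but only because both ends must already share the schema; the XML text stays readable without it.
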
Conversions from strongly typed PSVI to loosely typed XML should be trivial, while XML to PSVI should only require a W3C XML Schema (or other PSVI generator) to provide the necessary information. PSVI processors could use or extend existing XML infrastructures, replacing only the bottom layer - the parser - and possibly developing their own structures for the layers above.

I suspect that taking the PSVI to its fullest potential is going to involve a lot more work than taking untyped markup to its fullest potential. It's simply a larger set of problems.

A binary PSVI format could sure make XML-RPC (PSVI-RPC?) messages a lot smaller. All it takes is a spec, some free parsers, and some tools.

Maybe someday programmers will look back on XML as the bootstrap phase of the PSVI, while the occasional markup geek still pokes around CDATA sections.

--
Simon St.Laurent
Ring around the content, a pocket full of brackets
Errors, errors, all fall down!
http://simonstl.com