Re: PSVI
From: James Robertson <jamesr@s...>

>Can anyone give me a pointer to a _brief_ discussion of PSVI?

The Post Schema Validation Infoset is the current name given to the XML Infoset after the document has been schema-processed; that is, the infoset has been augmented.

There are four or so basic choices (non-exclusive) for what a Schema language can do:

-- transform the document, leaving the original in place: e.g. report Boolean valid, generate a set of links, pretty-print the document, generate code for interfaces

-- provide type information for programs accessing the infoset, and allow optimized infosets to be built: e.g. so that a query on a number stores the query value in an int rather than a string, and uses number-based comparisons not text-based

-- augment the infoset using the existing categories: e.g. the defaults in a DTD add special attribute values, but the fact of whether the attribute values were specified or defaulted or are fixed makes no difference to the normal API. (This augmentation could be implemented by lookup rather than inline if there is a 1-1 correspondence between node name and type. Even if there is not a 1-1 correspondence, the defaulting etc would usually not be explicit but implemented by pointing to a singleton representing the appropriate markup declarations.)

-- augment the infoset with new kinds of information which do not correspond to any XML markup: e.g. allow nodes to have properties or types or facets, allow individual nodes to have validity status (or other outcomes) added per node, allow extra nodes corresponding to a data value after it has been normalized and put into some optimized form such as an array of int. (This augmentation could be implemented by lookup rather than inline if there is a 1-1 correspondence between node name and type. Even if there is not a 1-1 correspondence, the defaulting etc would usually not be explicit but implemented by pointing to a singleton representing the appropriate markup declarations.)

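To make the second and third choices above concrete, here is a minimal Python sketch of augmentation "by lookup rather than inline". It is only an illustration under stated assumptions: the `Declaration` class, the `DECLARATIONS` table, the `get_attribute` helper and the sample `<order>` document are all hypothetical names invented for this example, and nothing here is taken from XML Schemas or any real PSVI implementation. It assumes the simple case of a 1-1 correspondence between element name and declaration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Declaration:
    """Hypothetical singleton standing in for an element's markup declarations."""
    defaults: dict = field(default_factory=dict)  # attribute name -> default value
    types: dict = field(default_factory=dict)     # attribute name -> converter


# Assumption: one shared declaration per element name (the 1-1 case).
DECLARATIONS = {
    "item": Declaration(defaults={"currency": "USD"}, types={"qty": int}),
}


def get_attribute(elem, name):
    """Return an attribute value, defaulted and typed by lookup.

    The parsed tree is never modified: defaults and types live only in
    the shared declaration that the element name points to.
    """
    decl = DECLARATIONS.get(elem.tag, Declaration())
    raw = elem.get(name, decl.defaults.get(name))
    if raw is None:
        return None
    convert = decl.types.get(name, str)
    return convert(raw)


doc = ET.fromstring('<order><item qty="9"/><item qty="10" currency="EUR"/></order>')
items = doc.findall("item")

# Defaulted and specified values look the same to the caller.
first_currency = get_attribute(items[0], "currency")
second_currency = get_attribute(items[1], "currency")

# Typed access gives number-based rather than text-based comparison.
typed_less = get_attribute(items[0], "qty") < get_attribute(items[1], "qty")
text_less = items[0].get("qty") < items[1].get("qty")
```

In this sketch the caller cannot tell the defaulted `currency` on the first item from the specified one on the second, and the numeric comparison of `qty` (9 < 10) succeeds where the text comparison ("9" < "10") does not. No new kinds of information are attached to the nodes, which is the point of separating these choices from the fourth.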
It is this last augmentation that is the "Post Schema Validation Infoset" approach. XML Schemas takes this approach.

There is no way to re-serialize the PSVI without altering the structure of the document drastically; however, some architectural forms etc could be constructed. There are some attempts to make a standard dump format for PSVI (Richard Tobin has one and I think Jonathan Borden has an idea in the works too for RDF) but structure is not preserved in these. So every time the document is transmitted, it must be re-augmented; there is no way for a document to declare "don't augment me", though perhaps the SOAP/XML Protocols technology might be able to do something (I doubt they would--a bit fiddly).

So I think it is important to disconnect the idea of schema augmentation and strong typing from the need for a PSVI proper: as I said above, there are augmentations possible which do not add any new information types (e.g. attribute defaulting) and schema queries can have strong typing built in. And casting _could_ be used in query languages to get strong typing even without a schema at all: e.g.

  <xsl:template match="person[@(date)birthday='1972/12/24']" >..

so we also should not assume that a schema is needed to get strong typing per se...just for automated selection of type.

The proponents of PSVI say it is harmless, that we already do similar things in real DOMs (i.e. if we already have a pointer to the element/attlist declaration it can be substituted for a singleton holding the XML Schema PSV information), that there are implementation techniques that make it efficient, that it opens up the door for more sophisticated processing, and that it is required in XML Schemas because otherwise we cannot process substitution groups generically and because the presence of xsi:type means that a query writer cannot always rely on the schema to know the type because type may be explicitly specified (which would also cause the strong-typed query to fail).
An augmented infoset can allow all sorts of nice error messages and information.

The opponents of PSVI say it is harmful, that it forces an across-the-board upgrade of technology with all the disruption and interoperability problems that will involve, that it does not provide much additional functionality (remember we are disconnecting PSVI from the ability of schemas to autogenerate optimized interfaces or queries, and from simple augmentation of the XML infoset), that it may mean that existing non-PSVI specs are not maintained, that on the WWW we need to reduce the amount of information sent so PSVI systems work, that on lightweight devices it is too slow or big to be useful (and so we will end up with subsets of XPath, XSLT, XML Schemas and all PSVI specs anyway), that implied markup is bad practice in any case (i.e. it is OK to use substitution groups to allow similar elements in a location, but they must be processed individually), and that xsi:type is a kludge that is only required because XML Schemas does not provide selection of type parameterized by attribute values (a.k.a. generalized markup).

The PSVI is not geared to a world of small-lightweight (or heavy-load) communicating devices but to the old world of big fat centralized systems and clients. Furthermore, allowing an XSLT 2 script to use the PSVI means that a programmer (e.g. a maintenance programmer, or a beg/borrow/stealer) needs to understand XML Schemas--this adds significantly to the background knowledge required; moreover, it is a betrayal of XML's basic premise that it (and by expectation its derived technologies) will be straightforward to use on the WWW and easy to implement.
Furthermore, at least some people think that the lack of expressiveness in some major areas of XML Schemas means that it cannot claim to be a "universal" schema language, and so the PSVI does not provide enough bang/buck: I know a Schematron fan who thinks Schematron has eroded the areas where XML Schemas is the preferable schema language, and James Clark (as reported in xmlhack.com) has commented on some other features he thinks are important to model.

Some people also may feel that the PSVI/XML Schemas is so complicated that it centralizes web technology into the hands of the privileged few (large companies and those funded by them, or Western countries in general) and so is fundamentally not a "people's technology"--we left the DPH behind a long time ago. They may also feel that this complexity and over-completeness plays into the hands of the large commercial interests by making the technology too difficult for start-ups and, being too verbose for reading, almost guarantees that fancy GUIs must be used to present the schemas (creating a market for the tools-makers).

I think it is possible to hold a middle view: that the PSVI is certainly useful and appropriate for many applications (editors, fat systems) but that it is not appropriate (or not appropriate _now_) to abandon non-PSVI versions of specs for PSVI versions, and that it would be more appropriate for other specs to make use of the other non-PSVI features made available by XML Schemas (as above: transformation, query typing, XML infoset augmentation).

I met several people at the W3C meeting in Boston who were very satisfied with XML Schemas; however, I don't recall that any of them actually required access to any of the new PSVI information (as distinct from information that could be expressed in the current DOM or XPath).
So I don't think Simon's comment that many people don't like XML Schemas is so relevant to evaluating the current desirability of the PSVI (and Henry's comment that there are many people who like XML Schemas is similarly not to the point). This is not so much an issue of XML Schemas but of how other specs make use of XML Schemas, IYKWIM. By forcing existing W3C technologies to be based on XML Schema PSVI, we don't have a world where co-existence is possible.

Hope this is useful and correct.

Cheers
Rick Jelliffe