Schema Extensibility
For a while I have been continuing a thread which started out thinking about the versioning of XML Schema types, in particular enums. The debate broadened, and a variety of helpful and interesting views were voiced about versioning in general and, as a related subject, extensibility. Personally I have been relating these comments to XML Schema structures, but I could just as easily have been talking about the service interface supported by those schemas. This has highlighted some different opinions about the value of various approaches to this problem, which I hope have resonated with those following the thread. I have become quite interested in the UBL work that Ken Holman has introduced, and in the position UBL is taking on separating the validation of structural conformance from value-based validation.

I guess the thing that I am still mostly undecided about is whether to allow for schema extensibility (using xs:any together with the 'sentry' approach proposed by David Orchard and others), or whether this is a recipe for an uncontrollable vocabulary.

I think the battleground is in part characterised by two positions. The first is a schema (or service) that, once published, is considered immutable, hence any change REQUIRES a NEW VERSION with a NEW NAMESPACE. The second is a schema which allows non-breaking changes to be introduced by both the schema owner and non schema authors, and which supports both forwards and backwards compatibility.

The first situation is a 'clean' and explicit model where the semantics are guaranteed not to be usurped by a non schema owner, but where even relatively minor change requirements can have a large impact on implementations (especially when there are a large number of external users of the vocabulary). Changes often take a relatively long while to surface in the standard, and this may impact business priorities. Versioning is handled by supporting one or more of the available schema versions, with old versions deprecated from time to time.

The schema extensibility approach promotes the idea that organisations may want to represent private relationships using data carried at specified points within the standard schema, in such a way that the data is only relevant between those parties (using a foreign namespace) and all others can safely ignore it (and that the schema author should not necessarily attempt to constrain this type of usage). It recognises that the pace of change to a standard schema often lags behind the operational requirements of user organisations, but that those organisations don't want to throw out the whole standard and 'go private'. It can imply that some TP extensions may be incorporated back into the main body of the standard at a later point, in which case any pair of parties using that extension can agree to move back to the standard definition at a time of their choosing. It also allows the schema owner to add non-breaking 'compatible' changes to a schema.

The downsides seem to be that a TP could introduce changes which subvert the intended semantics, and that, over time, what might have started out as a temporary expedient turns into an entrenched working implementation that is unlikely to be allocated budget to be re-synchronised with the standard.

So, in part, the question is: should a schema allow for unknown extensions for unknown purposes (but in specified locations) and still be considered 'compliant', or should schema authors attempt to constrain (eliminate) that behaviour? I can't help feeling the attraction of the second model, but my 'gut' tells me that something so inflexible will soon become a business constraint, and that will signal its demise. With my SOA hat on, I would recognise the importance of interoperability and the significant role that standardised vocabularies have to play. I also don't especially want to promote the myriad of point-to-point relationships that 'going private' implies, and instead want to leverage the 'reach' of a market standard.

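To make the "all others can safely ignore it" idea a little more concrete, here is a rough sketch of how a consumer might read only the elements in the standard namespace and skip anything in a foreign one. The namespaces, element names and the read_known_fields function are all invented for illustration, and it uses Python's standard ElementTree parser rather than a schema validator, so it says nothing about where in the schema the xs:any wildcard would be permitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces, invented for this sketch.
STD_NS = "urn:example:standard:order"
PARTNER_NS = "urn:example:partner:private"

# An instance of the "standard" vocabulary carrying one private extension
# element in a foreign namespace.
DOC = f"""
<order xmlns="{STD_NS}" xmlns:p="{PARTNER_NS}">
  <id>A-100</id>
  <quantity>3</quantity>
  <p:loyaltyTier>gold</p:loyaltyTier>
</order>
""".strip()


def read_known_fields(xml_text, known_ns=STD_NS):
    """Return (fields, ignored): child elements in known_ns, and the tags skipped."""
    root = ET.fromstring(xml_text)
    prefix = "{" + known_ns + "}"  # ElementTree spells names as "{namespace}local"
    fields = {}
    ignored = []
    for child in root:
        if child.tag.startswith(prefix):
            fields[child.tag[len(prefix):]] = child.text
        else:
            # Foreign-namespace content: not an error, simply not ours to interpret.
            ignored.append(child.tag)
    return fields, ignored


fields, ignored = read_known_fields(DOC)
```

A consumer written this way keeps working when a partner adds p:loyaltyTier, but it also never learns what that element means, which is exactly the out-of-band agreement problem.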
Personally I still have no definitive conclusion that I feel comfortable turning into a recommended approach within my own organisation and within the industry standards body that I work with from time to time, so I thought I'd give it one more go.

Some of the issues and comments highlighted by the earlier thread are provided below. Some are direct quotes from contributors, others are excerpts from various ramblings :-)

Cheers
Fraser

========================

- Extensibility is a critical aspect of any data [or service] model. Without extensibility all changes (however minor) effectively 'break' all provider and consumer implementations.
- There are no 'minor' changes; any change implies a semantic difference.
- Backwards compatible, yes (an instance that is valid against the previous version of a schema must also be valid against the new version), but not necessarily the other way around.
- xs:any together with the 'sentry' approach proposed by David Orchard (and others) provides a mechanism that allows an XML schema to be extended by both the schema namespace owner and a non schema author independently, in a manner which supports forwards and backwards compatibility for instance documents. That is, some categories of change can be accommodated which do NOT cause either the consumer or the provider implementation to REQUIRE change. Of course, extensions added by non schema owners represent a private relationship between the communicating parties and therefore require an out-of-band exchange of the type definitions and semantics. Also, such extensions can only be applied at specific locations in the base schema AND using a foreign namespace. This is sometimes referred to as the 'must ignore' pattern.
- A 'big bang' approach to versioning is not usually achievable in any practical sense. That is, it is generally not possible to enforce a 'breaking' change on all users of a schema/service simultaneously (or even within a constrained time window).
- Support for a version of a schema/service can in some cases be self-regulating. That is, if provider A only supports version 1.0 of a service whilst the majority of consumers expect to be able to integrate with version 1.1 (or 2.0), then chances are that provider A will be unable to win any business and will therefore be forced to upgrade. If a consumer supports version 1.0 but all potential [preferred] providers have upgraded to a later version, the consumer may not be able to place any business on behalf of its customers, and will therefore be forced to upgrade (assuming that version 1.0 and later versions are NOT backwards compatible).
- A schema or service interface is immutable. Once published it should never be changed (perhaps this is better stated as: the operations which make up the service interface should never be changed).
- Support for concurrent versions of a schema/service is a more effective method of dealing with change than schema extensibility. It makes versions explicitly typed, without the ambiguity of untyped sections (xs:any) which require each participant to enter into some out-of-band mechanism. Implementing an explicit new version has the crucial advantage that it is guaranteed NOT to break a consumer implementation using the current version unless the provider removes that version.
- Any change to a schema represents a semantic difference and therefore cannot be considered 'minor', and therefore requires a new version.
- We have come to the conclusion that semantically the definition of an enumerated field is its enumerations. Therefore changing the enumerations changes the definition. Adding enumerations locally seems like a poor practice.
- Adding a new value to an enumeration is not a compatible change if that value could be returned to a consumer who currently doesn't know about it (using the previous schema definition).
  If it's just on the receiving side, it MAY be compatible, since the previous version remains a valid subset.
- Schemas defined and managed by a standards body often move too slowly to accommodate the business priorities of participants. Allowing local extensions can enable an organisation to gain advantage from the broader 'reach' of the base standard to the majority of its partners, whilst supporting specific third party relationships which require additional [private] data not [currently] available within the base standard. Sometimes this additional data can represent a 'candidate' standard which may be incorporated at a future time.
- When standards become an inhibitor to business operations they will be usurped by local arrangements.
- Value based validation can be implemented as a separate layer, on top of structural conformance.
- Synchronisation of schema variants is necessary at the point when the number of variants indicates that the original semantics may have become obfuscated, or that a new [related] semantic ecosystem is emerging.
- If a large number (more than 1 :-) of business transactional schemas include a common complex type, and that complex type needs to be changed, this can create a synchronisation problem. So is there a different approach to dealing with the versioning of shared types?
- We are undertaking a new position where the schemas are going to be used solely for structural validation, and code list value validation (as agreed upon by trading partners) is a separate step.
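
The last point, and the earlier ones about enumerations, might be sketched roughly as follows. This is only my own illustration, not anything taken from UBL: the invoice/currency vocabulary, the function names and the code lists are made up, and the 'structural' layer is a hand-written stand-in for real schema validation (Python's standard library has no XSD validator).

```python
import xml.etree.ElementTree as ET


def structurally_valid(root):
    """Layer 1 (stand-in for schema validation): exactly one <currency> in an <invoice>."""
    return root.tag == "invoice" and len(root.findall("currency")) == 1


def value_valid(root, currency_codes):
    """Layer 2: the code list is agreed between partners and kept outside the schema."""
    return root.findtext("currency") in currency_codes


def validate(xml_text, currency_codes):
    """Run the structural check first, then the value check as a separate step."""
    root = ET.fromstring(xml_text)
    if not structurally_valid(root):
        return "structural error"
    if not value_valid(root, currency_codes):
        return "value error"
    return "ok"


# Hypothetical code lists: the second adds one value to the first.
CODES_V1 = {"EUR", "USD"}
CODES_V2 = CODES_V1 | {"GBP"}

DOC = "<invoice><currency>GBP</currency></invoice>"

old_consumer = validate(DOC, CODES_V1)  # consumer still holding the old list
new_consumer = validate(DOC, CODES_V2)  # consumer that has agreed the new list
broken = validate("<invoice/>", CODES_V2)  # missing the required child
```

Here the structure never changes when GBP is added, so the 'schema' stays put and only the agreed code list moves. The flip side is the enumeration point above: a consumer still holding the old list rejects a value it does not know about.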