|
[XML-DEV Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message] Re: DTDs, W3C Schemas, RELAX NG, Schematron?
> John Cowan wrote: > > > 9) RNG is closed under union and intersection: given two document classes > > described by RNG schemas, one can mechanically construct an RNG schema > > which describes documents appearing in both classes, and another which > > describes documents appearing in either class. > > Did we lose set difference? The thing that knocked me out years ago > when Murata started talking about set operations on hedge automata was > the result of Schema A - Schema B. Here's the application: you rev a > schema, then take the difference of version N - version N+1, and get a > schema for the class of documents you've just broken. -Tim Difference is also important for static type-checking. If you want to statically check that the output of some program with some input schema is guaranteed to be valid with respect to some output schema, then firstly you need to compute a schema that represents the result of the program when applied to a document with the input schema and then you need to compute whether that schema is a subset of the output schema. (You can also do it the other way round by computing the schema for the class of documents that when the program is applied to them produce documents conforming to the output schema and then check that this schema is a subset of the input schema.) Checking that S1 is a subset of S2 is the same as checking that the difference of S1 and S2 is empty. The question as regards intersection and difference is rather more complex than the question as regards union. With union, it's dead simple: if you have two RELAX NG schemas, s1.rng and s2.rng, then <choice> <externalRef href="s1.rng"/> <externalRef href="s2.rng"/> </choice> is a schema that matches the union. However, RELAX NG doesn't have general intersection and difference operators. With intersection and difference, there's a theoretical level at which the issue is simple. If you consider only elements (i.e. ignore attributes and text), then the class of language that RELAX NG schemas (like RELAX Core) can represent is regular tree languages. And it is a well-known theoretical result that regular tree languages are closed under intersection and difference. However, there is a big gap between theory and a practical application: - how big is the schema representing the intersection/difference? It might be too big to be practical. - how long does it take to compute the schema for the intersection/difference? - what about attributes? - what about typed data? I don't know the answers to these questions. The good news is that there's a very smart researcher, Haruo Hosoya, who created XDuce for his PhD, a large part of which is dealing with computing subset relationships for a type system which is very similar to RELAX NG. His last release of XDuce (http://xduce.sourceforge.net) adds support for attributes using the same approach as RELAX NG. My best guess of the situation is: - it should be possible is to create a practical application that computes the intersection or difference of two RELAX NG schemas; XDuce is pretty close to doing just this (however, it's written in O'Caml, which is a very powerful, but rather non-mainstream language); - it is not trivial to write such an application; - there may well be some exceptional cases, where the application couldn't do it at all, or where the result would be impractically large, or where it would take impractically long to compute the result. Hosoya-san or Murata-san would be able to give a more informed answer to this. James
|
PURCHASE STYLUS STUDIO ONLINE TODAY!Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced! Download The World's Best XML IDE!Accelerate XML development with our award-winning XML IDE - Download a free trial today! Subscribe in XML format
|
|||||||||

Cart








