RE: The <any/> element: bane of security or savior of versioni
Hi Folks,

Below is an approach for creating schemas that are backward and forward compatible without using the <any/> element. The key to this approach is using Schematron to validate extensions.

First I describe the approach, then I list its advantages and disadvantages, and then I solicit your thoughts on it.

CREATING BACKWARD-FORWARD COMPATIBLE SCHEMAS WITHOUT USING THE <any/> ELEMENT

The approach will be demonstrated using a Book example. I will show three versions of the Book schema, each version an extension of the previous version.

The version #1 Book schema declares an optional, repeatable <Element> element into which future extensions can be placed:

    <element name="Book">
        <complexType>
            <sequence>
                <element name="Title" type="string"/>
                <element name="Author" type="string"/>
                <element name="Date" type="gYear"/>
                <element name="ISBN" type="string"/>
                <element name="Publisher" type="string"/>
                <element name="Element" minOccurs="0" maxOccurs="unbounded">
                    <complexType>
                        <sequence>
                            <element name="Name" type="string"/>
                            <element name="Value" type="string"/>
                            <element name="Datatype" type="string"/>
                        </sequence>
                    </complexType>
                </element>
            </sequence>
        </complexType>
    </element>

The content of Book is: Title, Author, Date, ISBN, Publisher, and an optional, repeatable Element.

Here's a sample XML instance:

    <Book>
        <Title>My Life and Times</Title>
        <Author>Paul McCartney</Author>
        <Date>1998</Date>
        <ISBN>1-56592-235-2</ISBN>
        <Publisher>McMillan Publishing</Publisher>
    </Book>

... Time elapses. It is decided to update the Book schema. In addition to providing the title, author, date of publication, ISBN, and publisher information, we also want XML instances to contain information about the number of pages in the book. The first (extension) <Element> will hold the NumPages information.
A Schematron rule is used to validate that this is the case:

    <element name="Book">
        <complexType>
            <sequence>
                <element name="Title" type="string"/>
                <element name="Author" type="string"/>
                <element name="Date" type="gYear"/>
                <element name="ISBN" type="string"/>
                <element name="Publisher" type="string"/>
                <element name="Element" minOccurs="0" maxOccurs="unbounded">
                    <complexType>
                        <sequence>
                            <element name="Name" type="string"/>
                            <element name="Value" type="string"/>
                            <element name="Datatype" type="string"/>
                        </sequence>
                    </complexType>
                </element>
            </sequence>
        </complexType>
    </element>
    <annotation>
        <appinfo>
            <sch:pattern name="Book Extensions">
                <sch:rule context="bk:Book/bk:Element[1]">
                    <sch:assert test="bk:Name='NumPages' and bk:Datatype='nonNegativeInteger'">
                        NumPages is the first extension information item
                    </sch:assert>
                </sch:rule>
            </sch:pattern>
        </appinfo>
    </annotation>

Here's a sample XML instance:

    <Book>
        <Title>My Life and Times</Title>
        <Author>Paul McCartney</Author>
        <Date>1998</Date>
        <ISBN>1-56592-235-2</ISBN>
        <Publisher>McMillan Publishing</Publisher>
        <Element>
            <Name>NumPages</Name>
            <Value>345</Value>
            <Datatype>nonNegativeInteger</Datatype>
        </Element>
    </Book>

This instance will validate against the version #1 schema as well as the version #2 schema. Further, the version #1 instance shown above will validate against this new schema.

... More time elapses. It is decided to update the Book schema again. We want XML instances to also indicate whether the Book is hardcover. The second <Element> will hold the Hardcover information.
A second Schematron rule is added to validate that this is the case:

    <element name="Book">
        <complexType>
            <sequence>
                <element name="Title" type="string"/>
                <element name="Author" type="string"/>
                <element name="Date" type="gYear"/>
                <element name="ISBN" type="string"/>
                <element name="Publisher" type="string"/>
                <element name="Element" minOccurs="0" maxOccurs="unbounded">
                    <complexType>
                        <sequence>
                            <element name="Name" type="string"/>
                            <element name="Value" type="string"/>
                            <element name="Datatype" type="string"/>
                        </sequence>
                    </complexType>
                </element>
            </sequence>
        </complexType>
    </element>
    <annotation>
        <appinfo>
            <sch:pattern name="Book Extensions">
                <sch:rule context="bk:Book/bk:Element[1]">
                    <sch:assert test="bk:Name='NumPages' and bk:Datatype='nonNegativeInteger'">
                        NumPages is the first extension information item
                    </sch:assert>
                </sch:rule>
                <sch:rule context="bk:Book/bk:Element[2]">
                    <sch:assert test="bk:Name='Hardcover' and bk:Datatype='boolean'">
                        Hardcover is the second extension information item
                    </sch:assert>
                </sch:rule>
            </sch:pattern>
        </appinfo>
    </annotation>

Now the content of Book is: Title, Author, Date, ISBN, Publisher, a first Element holding the NumPages information, and a second Element indicating whether it is a Hardcover book.

Here's a sample XML instance:

    <Book>
        <Title>My Life and Times</Title>
        <Author>Paul McCartney</Author>
        <Date>1998</Date>
        <ISBN>1-56592-235-2</ISBN>
        <Publisher>McMillan Publishing</Publisher>
        <Element>
            <Name>NumPages</Name>
            <Value>345</Value>
            <Datatype>nonNegativeInteger</Datatype>
        </Element>
        <Element>
            <Name>Hardcover</Name>
            <Value>true</Value>
            <Datatype>boolean</Datatype>
        </Element>
    </Book>

This instance will validate against the version #1, version #2, and version #3 schemas. In fact, every instance shown above validates against every schema version. There is backward and forward compatibility among all schema versions!

NOTES:

1. I embedded the Schematron stuff within the XML Schema document.
Alternatively, I could put the Schematron stuff in a separate document.

2. I specified the datatype of the extension elements in a <Datatype> element. Alternatively, I could use xsi:type, e.g.

    <Value xsi:type="xs:nonNegativeInteger">345</Value>

(For this to validate, <Value> would have to be declared with a type such as anySimpleType rather than string, because xsi:type must name a type derived from the declared one.)

ADVANTAGES OF THIS APPROACH

Compare using the <any/> element to achieve backward-forward compatibility with the approach described above:

(a) The <any/> element permits any child element, which can contain anything.

(b) The approach described above constrains extensions just to the <Element> element. In the above example I allowed an unbounded number of <Element> occurrences, but I could easily have put a limit on the number of extensions by specifying a numeric value for maxOccurs. Also, in the above example I allowed <Name>, <Value>, and <Datatype> to hold any string, but I could easily constrain each of them as well. Thus the extensibility is easily controlled.

Assertion: the approach described in this message is a more controlled, safer way to create backward-forward compatible schemas than a strategy which uses the <any/> element.

DISADVANTAGES OF THIS APPROACH

The approach depends on the use of both XML Schemas and Schematron to express the constraints. Thus a person who wishes to use this approach must be fluent with both schema languages. Also, it means that two tools are needed to validate XML instances: an XML Schema validator and a Schematron validator.

The extensions appear a bit "different". Rather than the XML instances appearing as

    <NumPages>345</NumPages>

they appear as

    <Element>
        <Name>NumPages</Name>
        <Value>345</Value>
        <Datatype>nonNegativeInteger</Datatype>
    </Element>

The value of the <Name> element is the element name, the value of the <Value> element is the element value, and the value of the <Datatype> element is the element's datatype.
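
Because <Datatype> is just a string, none of the rules shown in this message check that <Value> actually conforms to it. As a rough sketch only, and not a real datatype assertion, XPath 1.0 string functions can approximate the two datatypes used here. The pattern name "Book Extension Values" is hypothetical, and the bk: and sch: prefixes are assumed to be bound as in the rules shown earlier. The digit test is narrower than xs:nonNegativeInteger (it rejects a leading "+", for example).

```xml
<!-- Sketch under stated assumptions: approximate datatype checks in XPath 1.0. -->
<!-- Kept in its own pattern, because within one pattern a node is only tested
     against the first rule whose context matches it. -->
<sch:pattern name="Book Extension Values">
    <!-- nonNegativeInteger: non-empty and nothing left once the digits are removed -->
    <sch:rule context="bk:Book/bk:Element[bk:Datatype='nonNegativeInteger']">
        <sch:assert test="normalize-space(bk:Value) != '' and translate(normalize-space(bk:Value), '0123456789', '') = ''">
            Value must consist of decimal digits only
        </sch:assert>
    </sch:rule>
    <!-- boolean: the four lexical forms that xs:boolean allows -->
    <sch:rule context="bk:Book/bk:Element[bk:Datatype='boolean']">
        <sch:assert test="normalize-space(bk:Value)='true' or normalize-space(bk:Value)='false' or normalize-space(bk:Value)='1' or normalize-space(bk:Value)='0'">
            Value must be true, false, 1 or 0
        </sch:assert>
    </sch:rule>
</sch:pattern>
```

Each new datatype would need its own hand-written rule, so this does not scale the way a built-in datatype assertion would.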

I believe that this approach limits extensions to only simple values.

Rick Jelliffe: when is Schematron going to have the ability to do datatype assertions, e.g. "The value of the <Value> element is of datatype xs:nonNegativeInteger"?

QUESTION

What other advantages and disadvantages do you see for the above approach?

/Roger
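
The tighter constraints mentioned under ADVANTAGES could be sketched roughly as follows, written as a drop-in replacement for the <Element> declaration and assuming, as in the schemas above, that the XML Schema namespace is the default namespace. The limits 10, 64, and 256 and the choice of NCName for <Name> are arbitrary illustrative assumptions, not part of the original schemas.

```xml
<!-- Sketch only: a constrained variant of the <Element> extension slot. -->
<element name="Element" minOccurs="0" maxOccurs="10">  <!-- at most 10 extensions (assumed limit) -->
    <complexType>
        <sequence>
            <element name="Name">
                <simpleType>
                    <!-- an XML name without a colon, capped in length -->
                    <restriction base="NCName">
                        <maxLength value="64"/>
                    </restriction>
                </simpleType>
            </element>
            <element name="Value">
                <simpleType>
                    <restriction base="string">
                        <maxLength value="256"/>
                    </restriction>
                </simpleType>
            </element>
            <element name="Datatype">
                <simpleType>
                    <!-- only the datatypes used in this message -->
                    <restriction base="string">
                        <enumeration value="nonNegativeInteger"/>
                        <enumeration value="boolean"/>
                    </restriction>
                </simpleType>
            </element>
        </sequence>
    </complexType>
</element>
```

One caveat with this design: the constraints have to be chosen in version #1 and left unchanged afterwards. An instance that uses a datatype outside the enumeration, or more than 10 extensions, would no longer validate against the older schema versions, which would defeat the forward compatibility.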