Re: XML Schemas: Best Practices
"Arnold, Curt" wrote: > ... projection pattern ... aggregation pattern ... decorator pattern After reading your message Curt, I studied and implemented the design patterns - projection, aggregation, and decorator. I discovered that each pattern could be implemented using the three methods that I described in Wednesday's message. Implementing the three methods for each pattern served to be very useful - it brought clarity to the issue. In implementing each of the patterns I found the same question arising: What is the Best Practice for implementing a container element that is to be comprised of variable content? - For the projection pattern the question was how to implement variable content comprised of specialized as well as generic elements. - For the aggregation pattern the question was how to implement specialized variable content that was embedded within a generic element. - For the decorator pattern the question was how to implement specialized variable content which contained a generic element. As I see it, the patterns are an instance document Best Practice issue (what's the best way to design an instance document), whereas the 3 implementation methods are a schema Best Practice issue (what's the best way to design a schema). Thus, for this discussion I would like to focus on the methods rather than on the patterns. Below I have summarized the three methods and incorporated the excellent points that Curt and Len made on the pros and cons of each method. There are several questions remaining, which I have interspersed in the summary. SUMMARY Problem Statement. Design an XML Schema for a container element (Catalogue) which is to be comprised of variable content (Book, or Magazine, or ...) <Catalogue> - variable content - </Catalogue> Ideally, the components in the variable content section may come from disjoint sources, i.e., from other, independently developed schemas. Example of <Catalogue> containing variable content: <Catalogue> <Book> ... 
</Book> <Magazine> ... </Magazine> <Book> ... </Book> </Catalogue> Below are three methods for implementing Catalogue. ****************************************************************** Method 1. Use an abstract element and element substitution to implement variable content. Method Description: There are four XML Schema concepts that must be understood for implementing this method: - an element can be declared abstract. - abstract elements cannot be instantiated in instance documents. - in instance documents the abstract element must be substituted by non abstract elements which are in a substitutionGroup with the abstract element. - elements may be in the substitutionGroup with the abstract element iff their type is the same as, or derives from the abstract element's type. Method Implementation: Declare an abstract element (Publication): <element name="Publication" abstract="true" type="c:PublicationType"/> Declare the container element (Catalogue) to have as its contents the abstract element: <element name="Catalogue"> <complexType> <sequence> <element ref="c:Publication" maxOccurs="unbounded"/> </sequence> </complexType> </element> Declare the elements that are to be in the variable content section (Book and Magazine) and put them in a substitutionGroup with the abstract element: <element name="Book" substitutionGroup="c:Publication" type="c:BookType"/> <element name="Magazine" substitutionGroup="c:Publication" type="c:MagazineType"/> In order for Book and Magazine to substitute for Publication, BookType and MagazineType must derive from PublicationType. 
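As a rough sketch of how a later schema author might join the substitution group, assume the declarations above (together with PublicationType) live in a schema document catalogue.xsd whose target namespace is http://www.example.org/catalogue. The file name, the namespace URI, and the names Journal, JournalType and Volume are all hypothetical, and the schema namespace used is that of the final W3C Recommendation:

```xml
<?xml version="1.0"?>
<!-- Hypothetical extension schema: adds Journal to the Publication substitutionGroup. -->
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:c="http://www.example.org/catalogue"
        targetNamespace="http://www.example.org/catalogue">

  <!-- Bring in Publication, PublicationType and Catalogue (same target namespace). -->
  <include schemaLocation="catalogue.xsd"/>

  <!-- JournalType derives from PublicationType, which is what allows
       Journal to substitute for Publication. -->
  <complexType name="JournalType">
    <complexContent>
      <extension base="c:PublicationType">
        <sequence>
          <element name="Volume" type="string"/>
        </sequence>
      </extension>
    </complexContent>
  </complexType>

  <element name="Journal" substitutionGroup="c:Publication" type="c:JournalType"/>
</schema>
```

Under these assumptions <Journal> could appear inside <Catalogue> without any change to the Catalogue declaration itself.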
Here are the type definitions for PublicationType, BookType and MagazineType:

PublicationType - the base type (Author is optional so that MagazineType, below, can legally drop it):

    <complexType name="PublicationType">
        <sequence>
            <element name="Title" type="string"/>
            <element name="Author" type="string" minOccurs="0" maxOccurs="unbounded"/>
            <element name="Date" type="year"/>
        </sequence>
    </complexType>

BookType - extends PublicationType by adding two new elements, ISBN and Publisher:

    <complexType name="BookType">
        <complexContent>
            <extension base="c:PublicationType">
                <sequence>
                    <element name="ISBN" type="string"/>
                    <element name="Publisher" type="string"/>
                </sequence>
            </extension>
        </complexContent>
    </complexType>

MagazineType - restricts PublicationType by dropping the Author element (a restriction may only narrow what the base type allows, so this works only when Author is optional in the base type):

    <complexType name="MagazineType">
        <complexContent>
            <restriction base="c:PublicationType">
                <sequence>
                    <element name="Title" type="string"/>
                    <element name="Author" type="string" minOccurs="0" maxOccurs="0"/>
                    <element name="Date" type="year"/>
                </sequence>
            </restriction>
        </complexContent>
    </complexType>

Method Advantages:

- This method allows you to easily extend the set of elements that may be used in the variable content section simply by adding new elements to the abstract element's substitutionGroup.

Method Disadvantages:

- The types of the elements that are to be used in the variable content section must all descend from the abstract element's type. Further, the elements must be in a substitutionGroup with the abstract element. These requirements represent severe restrictions on the usefulness of this method. The variable content section cannot contain elements whose type does not derive from the abstract element's type, or which are not in a substitutionGroup with the abstract element - as would typically be the case with independently developed components. For example, suppose another schema author creates a Newspaper element whose type does not descend from PublicationType and which is not in the substitutionGroup with Publication. Then <Catalogue> would not be able to contain the <Newspaper> element. The elements in the variable content section are all tied to the same type hierarchy tree. Thus, they are dependent and coupled.
- Oftentimes the variable content section will contain elements that are conceptually related but structurally vastly different. The base type (the abstract element's type) should contain items common to all the variable content elements. To allow for elements that may be very dissimilar, the base type would need to have very little structure. This defeats the purpose of inheritance.

Questions:

- In the second disadvantage above I state: "This defeats the purpose of inheritance." This seems like a very weak statement. Can you provide a stronger statement telling why it is bad that the base type has little structure?
- Have you noticed that I like to name things? Well, I would like to put a name to this method (and to all three methods). Any suggestions?

******************************************************************

Method 2. Use a repeatable <choice> element to achieve variable content.

Method Description:

This method is quite straightforward - simply list within a <choice> element all the components which can appear in the variable content section, and embed the <choice> element in the container element.

Method Implementation:

Declare within a <choice> element all the elements that may appear in the variable content section (Book, Magazine). Embed the <choice> element within the container element (Catalogue):

    <element name="Catalogue">
        <complexType>
            <choice minOccurs="0" maxOccurs="unbounded">
                <element ref="c:Book"/>
                <element ref="c:Magazine"/>
            </choice>
        </complexType>
    </element>

    <element name="Book" type="c:BookType"/>
    <element name="Magazine" type="c:MagazineType"/>

Method Advantages:

- The elements in the variable content section do not need a common type ancestry. Thus, the variable content section can contain dissimilar, independent, loosely coupled elements.
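To illustrate that advantage, here is a minimal sketch of a Catalogue that also accepts an element from an independently developed schema. The Newspaper element, its namespace http://www.example.org/news, the prefix n, the catalogue namespace URI, and the file name newspaper.xsd are all hypothetical; the only assumption about the other schema is that it declares Newspaper as a global element. The schema namespace is that of the final W3C Recommendation, and the BookType and MagazineType definitions from Method 1 are left out for brevity:

```xml
<?xml version="1.0"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:c="http://www.example.org/catalogue"
        xmlns:n="http://www.example.org/news"
        targetNamespace="http://www.example.org/catalogue">

  <!-- Import the other author's schema (hypothetical location).
       Its types need not derive from PublicationType. -->
  <import namespace="http://www.example.org/news"
          schemaLocation="newspaper.xsd"/>

  <element name="Catalogue">
    <complexType>
      <choice minOccurs="0" maxOccurs="unbounded">
        <element ref="c:Book"/>
        <element ref="c:Magazine"/>
        <!-- Element from the independently developed schema. -->
        <element ref="n:Newspaper"/>
      </choice>
    </complexType>
  </element>

  <element name="Book" type="c:BookType"/>
  <element name="Magazine" type="c:MagazineType"/>
  <!-- BookType and MagazineType: the definitions given under Method 1 go here. -->
</schema>
```

Note that the Catalogue declaration itself had to be edited to admit Newspaper.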
Method Disadvantages:

- The <choice> element allows you to group together dissimilar elements. While that has been touted as an advantage, it is really a double-edged sword. The elements in the variable content section have no type hierarchy to bind them together, to provide coherence among the elements.
- With method 1 you can easily extend the set of elements that may be used in the variable content section by creating a new element and putting it in the substitutionGroup with the abstract element. Instance documents can then start using the new element immediately. With method 2, in addition to creating the new element, you must also list it in the <choice> element. So method 2 requires a two-step process for adding a new element to the set of elements available in the variable content section. This is a bit more error-prone.

Questions:

- I am not sure that I believe the last sentence: "This is a bit more error-prone." Do you?
- Curt, you stated in your message that the disadvantage of this method is, "does not let people to extend your schema easily." Can you please elaborate on what you mean by this?
- Again, I would like to see a name for this method. Suggestions?

******************************************************************

Method 3. Use an abstract type and type substitution to achieve variable content.

Method Description:

There are three XML Schema concepts that must be understood for implementing this method:

- a complexType can be declared abstract.
- an element declared to be of an abstract type cannot have its content instantiated in instance documents (the element can be instantiated, but its content may not).
- in instance documents the element with the abstract type must have its content substituted by content from a non-abstract type which derives from the abstract type.
Method Implementation:

Define an abstract base type (PublicationType):

    <complexType name="PublicationType" abstract="true">
        <sequence>
            <element name="Title" type="string"/>
            <element name="Author" type="string" minOccurs="0" maxOccurs="unbounded"/>
            <element name="Date" type="year"/>
        </sequence>
    </complexType>

Declare the container element (Catalogue) to contain a base element (Publication), which is of the abstract base type:

    <element name="Catalogue">
        <complexType>
            <sequence>
                <element name="Publication" type="c:PublicationType" minOccurs="0" maxOccurs="unbounded"/>
            </sequence>
        </complexType>
    </element>

In instance documents, the content of <Publication> can only be of a non-abstract type which derives from PublicationType, such as BookType or MagazineType (we saw these type definitions in Method 1 above).

With this method instance documents will look different from what we saw with the above two methods. Namely, <Catalogue> will not contain variable content. Instead, it will always contain the same element (Publication). However, that element will contain variable content, selected by naming the derived type in xsi:type:

    <Catalogue>
        <Publication xsi:type="BookType"> ... </Publication>
        <Publication xsi:type="MagazineType"> ... </Publication>
        <Publication xsi:type="BookType"> ... </Publication>
    </Catalogue>

Method Advantages:

- Similar benefits to method 1. Namely, this method allows you to easily extend the content that may be used in the variable content section simply by creating new types which derive from the abstract base type.

Method Disadvantages:

- Similar weaknesses to method 1. Namely, all types must descend from the abstract type. This requirement prohibits the use of types which do not descend from the abstract type, as would typically be the situation when the type is in another, independently developed schema.
- This method has the additional weakness of not being as "clean" as the other methods in the instance documents, e.g., <Publication xsi:type="BookType"> is not as clean as <Book>.

Questions:

- The second disadvantage listed above is mighty weak. "Clean" is subjective. Can you think of a stronger statement?
- Name for this method?

Wrap-up Questions:

What would be your recommendation for "Best Practice for implementing a container element that is to be comprised of variable content?" Which of the above methods would you recommend using?

Based upon the above discussion I am tempted to recommend: "use method 2 - repeatable <choice> element - because it enables the variable content section to contain components from disjoint sources". I feel that this benefit outweighs its disadvantages. What are your thoughts on this?

This is a pretty cool issue. Thanks a lot Curt and Len for shedding light on the pitfalls and advantages of each method!

/Roger