Re: XML Schemas: Best Practices
Hi Folks,

[I have created an online version of yesterday's discussion of Global versus Local. It is at: http://www.xfront.com/GlobalVersusLocal.html. I will be updating it with today's excellent messages.]

Recall yesterday we examined two polar opposite design approaches for dealing with the issue of when to declare an element/type globally and when to declare it locally. The two design approaches are briefly described here:

(1) Create a single component which contains nested components (boxes within boxes)
(2) Create individual components and aggregate them together (separate boxes)

[Thanks to Toivo for the box analogy!]

We also discussed the characteristics of each design approach. The most noteworthy characteristic of each design is:

(1) The first design approach facilitates hiding (localizing) namespace complexities within the schema.
(2) The second design approach facilitates component reuse.

After sending the message yesterday I realized that the discussion tended to imply that you could have one or the other but not both (i.e., your schema could hide namespace complexities or it could have component reuse, but not both). Today Caroline Clewlow pointed out that you can have both [thanks Caroline!].

Let's consider the Book example again:

    <Book>
        <Title>Illusions</Title>
        <Author>Richard Bach</Author>
    </Book>

How can we hide (localize) the namespaces of Title and Author, and yet also have component reuse? Caroline's suggestion is to create a global type definition and nest the Title and Author declarations within it:

    <complexType name="Publication">
        <sequence>
            <element name="Title" type="string" minOccurs="1" maxOccurs="1"/>
            <element name="Author" type="string" minOccurs="1" maxOccurs="1"/>
        </sequence>
    </complexType>

    <element name="Book" type="cat:Publication"/>

With this design approach we obtain the benefits of both:

- hiding (localizing) the namespace complexity of Title and Author
- reuse of the Publication type

We can carry this idea a step further.
We can obtain even greater reuse (and still retain the benefit of hiding namespace complexity) by creating type definitions for Title and Author:

Third Design:

    <simpleType name="Title">
        <restriction base="string">
            <enumeration value="Mr."/>
            <enumeration value="Mrs."/>
            <enumeration value="Dr."/>
        </restriction>
    </simpleType>

    <simpleType name="Name">
        <restriction base="string">
            <minLength value="1"/>
        </restriction>
    </simpleType>

    <complexType name="Publication">
        <sequence>
            <element name="Title" type="cat:Title" minOccurs="1" maxOccurs="1"/>
            <element name="Author" type="cat:Name" minOccurs="1" maxOccurs="1"/>
        </sequence>
    </complexType>

    <element name="Book" type="cat:Publication"/>

This design has:

- maximized reuse (there are four reusable components: the Title type, the Name type, the Publication type, and the Book element)
- maximized the potential to hide (localize) namespaces [note how I phrased this: "maximize the potential ..." Whether, in fact, the namespaces of Title and Author are hidden or exposed is determined by the elementFormDefault "switch"].

This is quite nice! I see general design guidelines emerging ...

- Design your schema to maximize the potential for hiding (localizing) namespace complexities.
- Use elementFormDefault to act as a switch in controlling namespace exposure. If you want element namespaces exposed in instance documents, simply turn the elementFormDefault switch to "on" (i.e., set elementFormDefault="qualified"); if you don't want element namespaces exposed in instance documents, simply turn the elementFormDefault switch to "off" (i.e., set elementFormDefault="unqualified").
- Design your schema to maximize reuse.
- Use type definitions as the main form of component reuse. Place element declarations within type definitions; in so doing, you maximize the potential for hiding (localizing) namespace complexities.
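To make the elementFormDefault "switch" concrete, here is a minimal sketch of a complete schema document and the instance documents it implies. Several things here are assumptions rather than anything fixed by this thread: the namespace URI http://www.example.org/catalogue is a made-up placeholder for whatever the cat prefix is bound to, the XML Schema namespace shown is the one from the final W3C Recommendation (drafts used other URIs), and Title and Author are typed as plain strings, as in Caroline's version, to keep the sketch short.

```xml
<?xml version="1.0"?>
<!-- Sketch only: targetNamespace is a hypothetical placeholder. -->
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://www.example.org/catalogue"
        xmlns:cat="http://www.example.org/catalogue"
        elementFormDefault="unqualified">

    <!-- Title and Author are local declarations, so the switch applies to them. -->
    <complexType name="Publication">
        <sequence>
            <element name="Title" type="string"/>
            <element name="Author" type="string"/>
        </sequence>
    </complexType>

    <!-- Book is global, so it is always namespace-qualified in instances. -->
    <element name="Book" type="cat:Publication"/>
</schema>
```

With the switch "off" (elementFormDefault="unqualified", as above), only the global Book element carries the namespace in an instance document; Title and Author appear with no namespace:

```xml
<cat:Book xmlns:cat="http://www.example.org/catalogue">
    <Title>Illusions</Title>
    <Author>Richard Bach</Author>
</cat:Book>
```

Changing the one attribute to elementFormDefault="qualified" turns the switch "on". The type definitions stay exactly as they are, but a valid instance must now qualify the local elements as well, i.e., <cat:Title> and <cat:Author> inside <cat:Book> (or, equivalently, a default namespace declaration on Book).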
Let's now compare this design approach (which I have called the Third Design) with the design approach that we presented yesterday, which I called the Second Design (the separate boxes design). Recall the example that was given yesterday:

Second Design:

    <element name="Title" type="string"/>
    <element name="Author" type="string"/>

    <complexType name="Publication">
        <sequence>
            <element ref="cat:Title" minOccurs="1" maxOccurs="1"/>
            <element ref="cat:Author" minOccurs="1" maxOccurs="1"/>
        </sequence>
    </complexType>

    <element name="Book" type="cat:Publication"/>

The Second Design approach maximizes reuse, but it has absolutely no potential for namespace hiding: Title and Author are declared globally, and global elements are always namespace-qualified in instance documents, whatever the value of elementFormDefault.

You might argue, "hey, suppose that I want to expose namespaces in instance documents (and we have seen cases where this is desired), so this is a good design for me." Let's think about this for a moment. What if at a later date you change your mind and wish to hide namespaces (what if your users hate seeing all those namespace qualifiers in the instance documents)? You will need to redesign your schema (possibly scrapping it and starting over). Better to adopt the Third Design approach. That way you control whether namespaces are hidden or exposed by simply setting the value of elementFormDefault. No redesign of your schema is needed as you switch from exposing to hiding, or vice versa.

[That said, your particular project may need to sacrifice the ability to turn on/off namespace exposure because you require instance documents to be able to use synonyms/aliases. In such circumstances the Second Design approach may be the only viable alternative.]

Here is a brief summary of the characteristics of the Third Design approach.

Third Design Characteristics:

[1] Maximum reuse. The primary components of reuse are type definitions.
[2] Maximum namespace hiding. Element declarations are hidden within the types.

This list of characteristics needs to be expanded on.
Can someone give more of the characteristics of this Third Design approach?

[We really need to give names to these design approaches. Here are the suggestions which have been put forward:

First Design (the boxes within boxes design):
- The Compact Design, or
- The Hierarchical Design, or
- The Russian Doll Design, or
- Composition

Second Design (the separate boxes design):
- The Global Scope Design, or
- The Componentized Design, or
- The Salami Design, or
- The Shared Aggregation Design

Third Design (the maximize reuse and namespace hiding design):
- ???

Do you like the ones suggested? If so, which ones? If not, what do you suggest? (I am eager to see what names people come up with for the Third Design approach!) Let me know your preferences. Otherwise, yours truly will select a name.]

In today's mail, Jon Cleaver pointed out a difference between the First and Second Design approaches in terms of their impact on component changes. I have summarized his comments below [Jon, let me know if I have not accurately captured your comments]:

First Design (the boxes within boxes design): Localized change impact.
With this design approach there is, for all practical purposes, just a single component. The fact that the component contains other components is somewhat irrelevant to other schemas, since those inner components are not accessible (reusable). Hence, if the Book component changes (i.e., the components within it change), it will have a relatively limited impact since there is only "one" component changing.

Second Design (the separate boxes design): Far-reaching change impact.
The sheer number of reusable components means that any change to Book is more likely to impact many schemas, since its components may be used in many places and in many schemas.

In this message we have examined a third design approach for dealing with the issue of when to declare elements/types globally versus when to declare them locally.
Here are some things that need further discussion:

- What are your thoughts on this third design approach?
- What are the pros and cons of this third design approach?
- Do you prefer it to the other two design approaches?
- What name would you give to the third design approach?

[We have covered a lot of material over the last two days! Is it all making sense? Are there enough examples that the concepts are clear, or should more examples be given? For example, does everyone know what it means when I say that elementFormDefault can behave like a "switch", turning on or off namespace exposure in instance documents? This might not be clear to everyone since this is my own, concocted terminology. Let me know if I should explain this further.]

Great discussions!!! Thanks!!!

/Roger