RE: XML Schemas: Best Practices
is there any way you can put this info onto a web page and send out the url?

rjsjr

> -----Original Message-----
> From: Roger L. Costello [mailto:costello@m...]
> Sent: Wednesday, September 20, 2000 4:13 PM
> To: xml-dev@l...
> Cc: costello@m...; Pulvermacher,Mary K.; Heller,Mark J.;
> JohnSc@c...; Ripley,Michael W.
> Subject: Re: XML Schemas: Best Practices
>
>
> Hi Folks,
>
> Over the last couple of days I have had the good fortune to talk with
> some very bright people about one of the issues that was raised here.
> Namely, we discussed: what guidelines can be given with regard to
> whether an element should be declared locally versus globally? During
> the discussions some excellent points were made about the benefits of
> hiding schema complexity by using local elements in combination with
> setting elementFormDefault="unqualified". Below I have done my best to
> describe the points that were made during the discussions.
>
> Hiding Namespace Complexities using Local Element Declarations and
> elementFormDefault="unqualified"
>
> First some notes:
>
> (a) A typical schema will utilize elements and types from many different
> namespaces.
> (b) It is desirable to shield instance documents from the intricacies of
> schemas.
>
> One such schema intricacy that we would like to hide is the namespaces
> of all the different components being used by a schema. Oftentimes it
> is irrelevant to the instance document where the schema obtained its
> components. We would like such things to be kept hidden in the
> schema. By declaring elements locally and by setting
> elementFormDefault="unqualified" we can prevent the schema namespace
> complexities from sneaking into instance documents. How to do this is
> shown next.
>
> Example.
> Consider this schema for <camera>, where the <body> element is
> defined in the Nikon schema, the <lens> element is defined in the
> Olympus schema, and the <manual_adaptor> element is defined in the
> Pentex schema:
>
> <?xml version="1.0"?>
> <schema xmlns="http://www.w3.org/1999/XMLSchema"
>         targetNamespace="http://www.camera.org"
>         elementFormDefault="unqualified"
>         xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
>         xsi:schemaLocation=
>             "http://www.w3.org/1999/XMLSchema
>              http://www.w3.org/1999/XMLSchema.xsd"
>         xmlns:nikon="http://www.nikon.com"
>         xmlns:olympus="http://www.olympus.com"
>         xmlns:pentex="http://www.pentex.com">
>     <import namespace="http://www.nikon.com"
>             schemaLocation="Nikon.xsd"/>
>     <import namespace="http://www.olympus.com"
>             schemaLocation="Olympus.xsd"/>
>     <import namespace="http://www.pentex.com"
>             schemaLocation="Pentex.xsd"/>
>     <element name="camera">
>         <complexType>
>             <sequence>
>                 <element ref="nikon:body" minOccurs="1"
>                          maxOccurs="1"/>
>                 <element ref="olympus:lens" minOccurs="1"
>                          maxOccurs="1"/>
>                 <element ref="pentex:manual_adaptor" minOccurs="1"
>                          maxOccurs="1"/>
>             </sequence>
>         </complexType>
>     </element>
> </schema>
>
> Here's an example of a conforming instance document:
>
> <?xml version="1.0"?>
> <my:camera xmlns:my="http://www.camera.org"
>            xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
>            xsi:schemaLocation="http://www.camera.org Camera.xsd">
>     <body>Ergonomically designed casing for easy handling</body>
>     <lens>300mm zoom, 1.2 f-stop</lens>
>     <manual_adaptor>1/10,000 sec to 100 sec</manual_adaptor>
> </my:camera>
>
> The instance document is simple. There are no namespace qualifiers
> cluttering up the document, except for the one on camera (which is okay
> because it shows the namespace for the document as a whole). The
> instance document simply shows the components of camera - camera is
> comprised of body, lens, and manual_adaptor.
> The fact that the schema
> gets these three components from different namespaces is irrelevant and
> hidden within the schema.
>
> Consider now what the instance document would look like if the schema
> had declared elementFormDefault="qualified". Recall that
> elementFormDefault="qualified" means that in the instance document all
> elements must be qualified. Look at the resulting instance document:
>
> <?xml version="1.0"?>
> <my:camera xmlns:my="http://www.camera.org"
>            xmlns:nikon="http://www.nikon.com"
>            xmlns:olympus="http://www.olympus.com"
>            xmlns:pentex="http://www.pentex.com"
>            xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
>            xsi:schemaLocation="http://www.camera.org Camera.xsd">
>     <nikon:body>Ergonomically designed casing for easy
>         handling</nikon:body>
>     <olympus:lens>300mm zoom, 1.2 f-stop</olympus:lens>
>     <pentex:manual_adaptor>1/10,000 sec to
>         100 sec</pentex:manual_adaptor>
> </my:camera>
>
> This instance document is much more complex - it explicitly shows where
> all of the components come from. The complexities of the schema are
> thus sneaking into the instance document.
>
> Note, then, that to hide namespace complexity it is not simply a matter
> of declaring elements locally, but it is also important that
> elementFormDefault be set appropriately (to the value of
> "unqualified").
>
> Let's try to summarize the principles that have been described here:
>
> [1] Schema authors should design their schemas such that the
> complexities of the schema do not show up in the instance documents.
> One such complexity is the namespaces of the schema components (i.e.,
> where all the components come from).
> [2] The combination of declaring elements locally and setting
> elementFormDefault="unqualified" can be used to hide where all the
> components come from (i.e., the namespaces). Thus, that schema
> complexity is not transferred to the instance document.
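
One way to read principles [1] and [2], sketched in the syntax of the later XML Schema 1.0 Recommendation rather than the draft syntax quoted above. This is only a sketch under assumptions: the type names body_type, lens_type and manual_adaptor_type are hypothetical, and the imported schemas are assumed to define them. In the Recommendation, an element brought in with ref= is a global declaration and stays namespace-qualified in instances, so the sketch declares the three children locally (name= plus type=) and reuses only the imported types.

```xml
<?xml version="1.0"?>
<!-- Sketch only: body_type, lens_type and manual_adaptor_type are
     hypothetical names assumed to exist in the imported schemas. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.camera.org"
            xmlns:nikon="http://www.nikon.com"
            xmlns:olympus="http://www.olympus.com"
            xmlns:pentex="http://www.pentex.com"
            elementFormDefault="unqualified">
  <xsd:import namespace="http://www.nikon.com" schemaLocation="Nikon.xsd"/>
  <xsd:import namespace="http://www.olympus.com" schemaLocation="Olympus.xsd"/>
  <xsd:import namespace="http://www.pentex.com" schemaLocation="Pentex.xsd"/>
  <xsd:element name="camera">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Local declarations: with elementFormDefault="unqualified"
             these appear without a prefix in instance documents. -->
        <xsd:element name="body" type="nikon:body_type"/>
        <xsd:element name="lens" type="olympus:lens_type"/>
        <xsd:element name="manual_adaptor" type="pentex:manual_adaptor_type"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

With this arrangement only the root camera element, which is global, needs a prefix in the instance document.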
>
> This discussion has argued for keeping hidden in the schema the location
> (namespace) of the components. However, there are scenarios where it is
> desirable to make such namespaces explicit, i.e., we want the instance
> documents to explicitly show where the components come from. Would
> anyone care to make a case for that?
>
> /Roger
>
> "Roger L. Costello" wrote:
> >
> > Hi Folks,
> >
> > I would like to see if we can collectively come up with a set of "best
> > practices" in designing XML Schemas. I realize that the specifics of
> > designing a schema are heavily dependent upon the task at hand.
> > However, I firmly believe that there are guidelines that can be employed
> > in creating a schema, and those guidelines hold true irrespective of the
> > specific task. It is this set of guidelines that I am hoping we can
> > shed some light upon.
> >
> > I would like to get things started by listing some of the things that
> > must be considered in designing a schema. It is by no means an
> > exhaustive list. For example, it doesn't address when to block a type
> > from derivation, when to create a schema without a namespace, when to
> > make an element or a type abstract, etc. Nonetheless, it is a start to
> > some hopefully useful discussions.
> >
> > First, a quick list of the issues:
> >
> > [1] Element versus Type Reuse
> > [2] Local versus Global
> > [3] elementFormDefault - to qualify or not to qualify
> > [4] Evolvability/versioning
> > [5] One namespace versus many namespaces (import versus include)
> > [6] Capturing semantics of elements and types
> >
> > Now, details of each issue:
> >
> > [1] Element versus Type Reuse: from my own experience in building
> > schemas I have found that it is oftentimes not obvious whether to
> > declare something as an element and then reuse that element, or to
> > declare it as a type and reuse the type.
> > Let's consider the two cases
> > by looking at an example:
> >
> > Element Reuse
> >
> > - Declare an element for reuse:
> >
> > |<element name="Elevation">
> > |    <simpleType base="integer">
> > |        <minInclusive value="-1290"/>
> > |        <maxInclusive value="29028"/>
> > |    </simpleType>
> > |</element>
> >
> > - Reusing the element:
> >
> > |<element name="Boston">
> > |    <complexType>
> > |        <sequence>
> > |            <element ref="city:Elevation"/>
> > |        </sequence>
> > |    </complexType>
> > |</element>
> >
> > Type Reuse
> >
> > - Declare a type for reuse:
> >
> > |<simpleType name="Elevation" base="integer">
> > |    <minInclusive value="-1290"/>
> > |    <maxInclusive value="29028"/>
> > |</simpleType>
> >
> > - Reusing the type:
> >
> > |<element name="Boston">
> > |    <complexType>
> > |        <sequence>
> > |            <element name="Elevation" type="city:Elevation"/>
> > |        </sequence>
> > |    </complexType>
> > |</element>
> >
> > Which is preferred - declare Elevation as an element and reuse that
> > element, or declare Elevation as a type and reuse the type? Here are
> > some things to consider:
> >
> > - Declaring it as an element will allow equivClasses to be created, thus
> > enabling the Elevation element to be substituted by members of the
> > equivClass.
> > - Declaring it as a type will allow derived types to be created, thus
> > enabling the Elevation type to be substituted by derived types.
> > - Someone once said that XML Schemas is a "type-based system". I am not
> > sure what that means, but perhaps it means that the idea behind XML
> > Schemas is to reuse types?
> > - In programming languages types are the items typically that get
> > reused. Does that apply to XML Schemas, or not?
> >
> > What are your thoughts on type versus element reuse? What guidelines
> > would you recommend to someone struggling to decide whether he/she
> > should make an item as an element or as a type?
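
For comparison, here is a rough sketch of the same two reuse styles in the syntax of the final XML Schema 1.0 Recommendation, where facets sit inside a restriction element and equivClass was renamed to substitutionGroup. The namespace URI bound to the city prefix, the type name ElevationType and the Denver element are placeholders of mine; the post does not give them.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the namespace URI, ElevationType and Denver are
     hypothetical and not taken from the original post. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:city="http://www.example.org/city"
            targetNamespace="http://www.example.org/city">

  <!-- A named simple type that can be reused or derived from -->
  <xsd:simpleType name="ElevationType">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="-1290"/>
      <xsd:maxInclusive value="29028"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- A global element that can be referenced or head a substitutionGroup -->
  <xsd:element name="Elevation" type="city:ElevationType"/>

  <!-- Element reuse: refer to the global element -->
  <xsd:element name="Boston">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="city:Elevation"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <!-- Type reuse: declare a local element of the named type -->
  <xsd:element name="Denver">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Elevation" type="city:ElevationType"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Note that defining the named type does not rule out element reuse: the global Elevation element here is itself declared in terms of ElevationType, so both styles can coexist in one schema.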
> >
> > [2] Local versus Global: when should an element or type be declared
> > globally versus when should it be nested within something else (i.e.,
> > local)? Again, let's take an example:
> >
> > - Everything Global
> >
> > |<element name="Book" type="cat:Listing"/>
> > |<complexType name="Listing">
> > |    <sequence>
> > |        <element ref="cat:Title"/>
> > |        <element ref="cat:Author"/>
> > |    </sequence>
> > |</complexType>
> > |<element name="Title" type="string"/>
> > |<element name="Author" type="string"/>
> >
> > - Everything Local
> >
> > |<element name="Book">
> > |    <complexType>
> > |        <sequence>
> > |            <element name="Title" type="string"/>
> > |            <element name="Author" type="string"/>
> > |        </sequence>
> > |    </complexType>
> > |</element>
> >
> > What guidance can we provide a schema designer in deciding whether or
> > not to "hide" a type or element (by nesting it)? Someone once asked me
> > when it would be desirable to make an element or type local. I was hard
> > pressed to think of a situation. Thus, I was not able to provide
> > guidance on when to use elements/types locally. It is easy to see the
> > benefit of declaring elements/types globally - they can be reused, not
> > only within a schema but also across schemas. It is not so easy for me
> > to see the benefit of hiding elements/types. Can someone provide
> > guidance on this issue? Does the OO encapsulation principle apply to
> > XML Schemas? If so, why? If not, why not?
> >
> > [3] elementFormDefault - to qualify or not to qualify:
> > elementFormDefault is an attribute of <schema>. It is used to dictate
> > what elements are to be namespace-qualified in instance documents: a
> > value of "qualified" means that everything is namespace-qualified in
> > the instance document, whereas a value of "unqualified" means that only
> > global items are namespace-qualified.
> > Personally, I find that for
> > simplicity it is easiest to use "qualified" and then in the instance
> > document use a default namespace declaration. It is not real clear to
> > me the advantages of using "unqualified". In other words, I would not
> > be able to provide good guidance on when to use "unqualified". If
> > someone asked you to list the scenarios when it would be desirable to
> > use "unqualified" what guidance would you give?
> >
> > [4] Evolvability/versioning: in today's rapidly changing marketplace,
> > there is no question that schemas will need to change (evolve). What
> > guidance do you provide a schema designer in engineering his/her schema
> > to support change? When a schema is changed, how do you indicate that
> > it is a new version - with a new namespace?
> >
> > I have thought quite a bit about schema evolution. At the end of this
> > message I expound quite a bit on this subject.
> >
> > As for versioning, that is something that I would be hard pressed to
> > provide guidance upon. When a new version of a schema is created, what
> > techniques should one use to signify the new version? One idea is to
> > create a new namespace for the new version. Another idea is to simply
> > change the version attribute on <schema>. How would you indicate a new
> > version?
> >
> > [5] One namespace versus many namespaces (import versus include): I
> > think that in a typical project many schemas will be created. A
> > question will then arise, "shall we define one namespace for all the
> > schemas or shall we create a different namespace for each schema?" What
> > are the tradeoffs in creating multiple namespaces versus a single
> > namespace? What guidance would you give someone starting on a project
> > that will create multiple namespaces - create a namespace for each
> > schema or one umbrella namespace?
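
As background for [5], a small sketch of how the two mechanisms differ in the final XML Schema 1.0 Recommendation: include pulls in a schema document that shares the including schema's target namespace (or has none), while import brings in components from a different namespace. The namespace URIs and the file names Orders.xsd and Common.xsd are made up for illustration.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the namespace URIs and file names are hypothetical. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/project"
            xmlns:proj="http://www.example.org/project"
            xmlns:common="http://www.example.org/common">

  <!-- One umbrella namespace: Orders.xsd is assumed to use the same
       targetNamespace as this schema, so it is included. -->
  <xsd:include schemaLocation="Orders.xsd"/>

  <!-- A namespace per schema: Common.xsd is assumed to have its own
       targetNamespace, so it must be imported and referenced by prefix. -->
  <xsd:import namespace="http://www.example.org/common"
              schemaLocation="Common.xsd"/>
</xsd:schema>
```

So the choice between one namespace and many largely determines which of the two elements a project ends up using.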
> >
> > [6] Capturing semantics of elements and types: a schema creates
> > elements, defines the relationships between the elements, and defines
> > the datatypes of the elements. However, that by itself doesn't define
> > the semantics of the elements. For example, consider this element
> > declaration:
> >
> > <element name="jdkdsfjkds">
> >     <simpleType base="string">
> >         <pattern value="[a-zA-Z]+\d"/>
> >     </simpleType>
> > </element>
> >
> > Does this tell you the meaning of "jdkdsfjkds"? Probably not.
> > Something more is needed. What guidelines would you give someone
> > wishing to document the semantics of the items created in a schema?
> >
> > Here are some guidelines that Mary Pulvermacher sent to me:
> >
> > "Our current thinking is to capture as much of the semantics as possible
> > in the XML schema itself. We plan to do this by using the XML Schema
> > provided annotation element and having a convention that every element
> > or attribute has an annotation that provides information on the
> > meaning. Of course this is not perfect but it does carry some
> > advantages.
> >
> > - The XML schema will capture the data structure, meta-data and
> > relationships between the elements.
> > - Use of strong typing will capture much of the data content.
> > - The annotations can capture definitions and other explanatory
> > information.
> > - The structure of the "definitions" will always be consistent with the
> > structure used in the schema since they are linked.
> > - Since the schema itself is an XML document, we can use XSL to
> > transform this information into a format suitable for human
> > consumption."
> >
> > Do you have any other thoughts on capturing the semantics of elements
> > and types created by a schema? What guidance would you give to someone
> > wishing to capture the semantics of the elements and types?
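
The annotation convention described in [6] might look roughly like this in the final XML Schema 1.0 Recommendation syntax. The documentation text is invented purely for illustration; the post does not say what "jdkdsfjkds" means.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the wording inside xsd:documentation is hypothetical. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="jdkdsfjkds">
    <xsd:annotation>
      <!-- Human-readable meaning, kept next to the declaration it describes -->
      <xsd:documentation xml:lang="en">
        Hypothetical definition: a part code made of one or more
        letters followed by a single digit.
      </xsd:documentation>
    </xsd:annotation>
    <xsd:simpleType>
      <xsd:restriction base="xsd:string">
        <xsd:pattern value="[a-zA-Z]+\d"/>
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:element>
</xsd:schema>
```

Because the definition travels with the declaration, a stylesheet that walks the schema can pull out each name together with its documentation text.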
> > --------------------------------------------------------------------
> >
> > Some thoughts on enabling schema evolution (expansion of [4] above)
> >
> > In today's rapidly changing market static schemas will be less
> > commonplace, as the market pushes schemas to quickly support new
> > capabilities. For example, consider the cellphone industry. Clearly,
> > this is a rapidly evolving market. Any schema that the cellphone
> > community creates will soon become obsolete as hardware/software changes
> > extend the cellphone capabilities. For the cellphone community rapid
> > evolution of a cellphone schema is not just a nicety, the market demands
> > it!
> >
> > Suppose that the cellphone community gets together and creates a schema,
> > cellphone.xsd. Imagine that every week NOKIA sends out to the various
> > vendors an instance document (conforming to cellphone.xsd), detailing
> > its current product set. Now suppose that a few months after
> > cellphone.xsd is agreed upon NOKIA makes some breakthroughs in their
> > cellphones - they create new memory, call, and display features, none of
> > which are supported by cellphone.xsd. To gain a market advantage NOKIA
> > will want to get information about these new capabilities to its vendors
> > ASAP. Further, they will have little motivation to wait for the next
> > meeting of the cellphone community to consider upgrades to
> > cellphone.xsd. They need results NOW. How does open content help?
> > That is described next.
> >
> > Suppose that the cellphone schema is declared "open". Immediately NOKIA
> > can extend its instance documents to incorporate data about the new
> > features. How does this change impact the vendor applications that
> > receive the instance documents? The answer is - not at all. In the
> > worst case, the vendor's application will simply skip over the new
> > elements.
> > More likely, however, the vendors are showing
> > the cellphone features in a list box and these new features will be
> > automatically captured with the other features. Let's stop and think
> > about what has been just described - without modifying the cellphone
> > schema and without touching the vendor's applications, information about
> > the new NOKIA features has been instantly disseminated to the
> > marketplace! Open content in the cellphone schema is the enabler for
> > this rapid dissemination.
> >
> > Clearly some types of instance document extensions may require
> > modification to the vendor's applications. Recognize, however, that
> > the vendors are free to upgrade their applications in their own time.
> > The applications do not need to be upgraded before changes can be
> > introduced into instance documents. At the very worst, the vendor's
> > applications will simply skip over the extensions. And, of course,
> > those vendors do not need to upgrade in lock-step.
> >
> > To wrap up this example - suppose that several months later the
> > cellphone community reconvenes to discuss enhancements to the schema.
> > The new features that NOKIA first introduced into the marketplace are
> > then officially added into the schema. Thus completes the cycle.
> > Changes to the instance documents have driven the evolution of the
> > schema.
>
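
The post does not say how a schema is declared "open". One way this can be expressed in the final XML Schema 1.0 Recommendation is an element wildcard, sketched below. The element names cellphone and model and the namespace URI are hypothetical, and restricting the wildcard to other namespaces is my assumption, not something the post specifies.

```xml
<?xml version="1.0"?>
<!-- Sketch only: cellphone, model and the namespace URI are hypothetical. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/cellphone"
            elementFormDefault="qualified">
  <xsd:element name="cellphone">
    <xsd:complexType>
      <xsd:sequence>
        <!-- The agreed-upon content -->
        <xsd:element name="model" type="xsd:string"/>
        <!-- Extension point: any number of elements from other namespaces,
             validated only if a declaration for them happens to be available -->
        <xsd:any namespace="##other" processContents="lax"
                 minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Under this sketch a vendor could add new feature elements in its own namespace after model, and existing instance documents and the schema itself would remain valid unchanged.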