Re: XML Schemas: Best Practices
Hi Folks,

I am delighted to see the responses to my last message. Clearly people are thinking about this issue and have strong feelings about hiding namespace complexities in the schema versus making namespaces explicit in instance documents. This is good! Now let's see if we can distill some general guidelines on when to hide and when to make explicit.

Based upon some of the responses I can see that I did not do a very satisfactory job of motivating when you would want to hide the namespace complexities. So let's quickly address that again, and then move on to guidelines for when it is desirable to make namespaces explicit in instance documents.

Recall the camera example that was presented. By designing the schema so that body, lens, and manual_adaptor are children of camera (i.e., local elements), and by setting elementFormDefault="unqualified", we enable the creation of a class of instance documents that are pretty straightforward to read and write. An example of one instance document was presented:

    <?xml version="1.0"?>
    <my:camera xmlns:my="http://www.camera.org">
        <body>Ergonomically designed casing for easy handling</body>
        <lens>300mm zoom, 1.2 f-stop</lens>
        <manual_adaptor>1/10,000 sec to 100 sec</manual_adaptor>
    </my:camera>

Recall that the schema imported the declaration of the body element from the nikon schema, the lens element from the olympia schema, and the manual_adaptor element from the pentex schema. Looking at the instance document above one would never realize this. Such complexities are localized to the schema. Thus, we say that the schema has been designed in such a fashion that its complexities are "hidden" from the instance document.

Several people responded to this design approach, arguing that it is good and perhaps necessary to qualify body, lens, and manual_adaptor.
Below I show the instance document with all elements qualified with a namespace:

    <?xml version="1.0"?>
    <my:camera xmlns:my="http://www.camera.org"
               xmlns:nikon="http://www.nokia.com"
               xmlns:olympia="http://www.olympia.com"
               xmlns:pentex="http://www.pentex.com">
        <nikon:body>Ergonomically designed casing for easy handling</nikon:body>
        <olympia:lens>300mm zoom, 1.2 f-stop</olympia:lens>
        <pentex:manual_adaptor>1/10,000 sec to 100 sec</pentex:manual_adaptor>
    </my:camera>

This instance document makes explicit that the body element comes from the nikon namespace, the lens element comes from the olympia namespace, and the manual_adaptor element comes from the pentex namespace.

Thus, we come to two fundamental questions:

[1] When does it make sense to design a schema to hide the namespace complexities from instance documents?
[2] When does it make sense to design a schema to force instance documents to make explicit the namespaces of their elements?

The latter question will be answered in the next section. For now, let's try to characterize the systems for which it makes sense to hide the namespace complexities in the schema.

As I compare the two versions of the instance document above, the first thing that strikes me is the difference in readability. The first version is much easier to read. The namespaces in the second version - both the namespace declarations and the qualifiers on each element - are very confusing to an average fellow like myself. So, I come to the first characteristic:

"For systems where readability is of utmost importance, design the schema to hide the namespace complexities."

I can well imagine writing an application to process the camera instance document such that it (the application) does not care what namespace the body element comes from, what namespace the lens element comes from, or what namespace the manual_adaptor element comes from. Such complexities are irrelevant to the application.
The application just cares that the camera element contains a body element with the proper type of data, a lens element with the proper type of data, and a manual_adaptor element with the proper type of data. Knowledge of the namespaces that the body, lens, and manual_adaptor elements belong to provides no additional information to the application. At best, the namespaces are a distraction to the application. If at some point the application does find it necessary to know what namespace an element is associated with, then it will simply look it up in the schema. This brings me to the second characteristic:

"For systems where knowledge of the namespaces of the elements provides no additional information, design the schema to hide the namespace complexities."

Those are the two characteristics that I see. Do you see any further characterizing features?

Before moving on to when it makes sense to make the namespaces explicit in instance documents, I would like to pause and address Richard Lanyon's concern. Richard's concern is (paraphrasing): "Okay Roger, let's suppose that it makes sense to localize the complexities to the schema. An author of an instance document will still have to read the schema, and understand it, to write the instance document. Correct? How have we hidden the complexities of the schema?"

Let me see if I can address this concern satisfactorily:

[1] An instance document is written once but processed by many systems (write once, read many). All those systems which process the document are shielded from the complexities of the schema.
[2] In the not-too-distant future there will be tools that read a schema and provide a template for the instance document author to fill in. The tool will understand the schema and shield the author from needing to understand it.

I hope that answers your concern satisfactorily, Richard. If anyone else has anything to add to this, please join in.
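
To make the second characteristic concrete, here is a minimal sketch of an application that processes the "hidden namespace" camera document without ever looking at where body, lens, and manual_adaptor came from. It assumes Python and its standard xml.etree.ElementTree module; the function name read_camera is hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# The "hidden namespace" instance document from earlier in this message.
# Only the root element is qualified; its children are unqualified.
CAMERA_DOC = """<?xml version="1.0"?>
<my:camera xmlns:my="http://www.camera.org">
    <body>Ergonomically designed casing for easy handling</body>
    <lens>300mm zoom, 1.2 f-stop</lens>
    <manual_adaptor>1/10,000 sec to 100 sec</manual_adaptor>
</my:camera>"""


def read_camera(xml_text):
    """Return the camera's parts as a dict keyed by plain element name.

    Hypothetical helper: the application only checks that it was handed
    a camera and then reads the children by their local names.
    """
    root = ET.fromstring(xml_text)
    # ElementTree reports a qualified name as "{namespace-uri}local-name".
    if root.tag != "{http://www.camera.org}camera":
        raise ValueError("not a camera document")
    # Unqualified children are in no namespace, so a plain name finds them.
    return {name: root.findtext(name)
            for name in ("body", "lens", "manual_adaptor")}


parts = read_camera(CAMERA_DOC)
print(parts["lens"])
```

Nothing in read_camera mentions the nikon, olympia, or pentex namespaces; if the application ever needed that information it would have to consult the schema, as described above.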
Now let's move on to characterizing those systems for which it makes sense to design a schema to force instance documents to make explicit the namespaces of their elements. First recall the techniques a schema uses to force instance documents to expose the namespaces of their elements.

[1] Use elementFormDefault="qualified" to Force the Use of Namespace Qualifiers

Len Bullard sketched out a schema for a 3D rendering system. Let me refer to that as the "video-game" schema. Let's see how to design that schema so that it forces instance documents to use namespace qualifiers on their elements:

    <?xml version="1.0"?>
    <schema xmlns="http://www.w3.org/1999/XMLSchema"
            targetNamespace="http://www.video-game.org"
            elementFormDefault="qualified"
            xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
            xsi:schemaLocation=
                "http://www.w3.org/1999/XMLSchema
                 http://www.w3.org/1999/XMLSchema.xsd"
            xmlns:design-works="http://www.design-works.com"
            xmlns:disney="http://www.disney.com"
            xmlns:mci="http://www.mci.com">
        <import namespace="http://www.design-works.com"
                schemaLocation="DesignWorks.xsd"/>
        <import namespace="http://www.disney.com"
                schemaLocation="Disney.xsd"/>
        <import namespace="http://www.mci.com"
                schemaLocation="MCI.xsd"/>
        <element name="video-game">
            <complexType>
                <sequence>
                    <element ref="design-works:geometry" minOccurs="1" maxOccurs="1"/>
                    <element ref="design-works:lighting" minOccurs="1" maxOccurs="1"/>
                    <element ref="disney:character" minOccurs="1" maxOccurs="1"/>
                    <element ref="mci:voice" minOccurs="1" maxOccurs="1"/>
                </sequence>
            </complexType>
        </element>
    </schema>

The most important part of this schema is elementFormDefault="qualified".
That attribute forces instance documents to qualify all elements:

    <?xml version="1.0"?>
    <video-game xmlns="http://www.video-game.org"
                xmlns:design-works="http://www.design-works.com"
                xmlns:disney="http://www.disney.com"
                xmlns:mci="http://www.mci.com"
                xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
                xsi:schemaLocation="http://www.video-game.org VideoGame.xsd">
        <design-works:geometry>24m x 71m</design-works:geometry>
        <disney:lighting>Shadow in foreground, light in back</disney:lighting>
        <mci:voice>Digitized voice</mci:voice>
    </video-game>

[2] Declare Elements Globally to Force the Use of Namespace Qualifiers

Global elements must be qualified in instance documents regardless of whether elementFormDefault has the value of "qualified" or "unqualified". Thus, we could reorganize the above schema to make all the elements global. [Interestingly, for the video-game schema I don't see how to make geometry, lighting, and voice global. Any thoughts?]

Now it is time to answer the question: what characterizes systems for which it makes sense to design the schema so that instance documents are forced to display the namespaces for each element? One quick answer is:

"For systems where knowledge of the namespaces DOES provide additional information, design the schema to force exposure of namespaces in instance documents."

However, this leaves me a bit empty. When does "knowledge of the namespaces provide additional information"? That is the question which must be answered.

Suppose that an application will process the geometry element differently if it's associated with design-works versus some other namespace. I could imagine that such preferential treatment may occur for marketing purposes. When else? What are your thoughts on this?

Clearly namespaces are great for dealing with name collisions. In the video-game example I don't have multiple elements with the same name.
If I did, however, and they came from different namespaces, then it is easy to imagine that we would want to design the schema to force instance documents to expose the namespaces so that applications could easily distinguish the elements.

Let's try rephrasing the above characterization given this new information:

"For systems where knowledge of the namespaces does provide additional information, design the schema to force exposure of namespaces in instance documents. Knowledge of namespaces may enable applications to:
   - perform namespace-dependent processing, and
   - distinguish between elements with the same name."

Okay, that's enough for now. Your turn. What are your thoughts on any of this? What guidelines would you provide someone who asks you: "Should I design my schema to hide the namespace complexities, or should I design it to force instance documents to expose the namespaces of their elements?"

/Roger
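
As a postscript to the rephrased characterization, here is a minimal sketch of both uses of exposed namespaces: namespace-dependent processing, and telling apart two elements that share a local name. It assumes Python's standard xml.etree.ElementTree module. The instance document is hypothetical (it places a design-works lighting element next to a disney lighting element, and the second element's text is invented), and the handler table and function names are likewise assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance: two "lighting" elements with the same local name,
# supplied by different namespaces. The second element's text is invented.
GAME_DOC = """<?xml version="1.0"?>
<video-game xmlns="http://www.video-game.org"
            xmlns:design-works="http://www.design-works.com"
            xmlns:disney="http://www.disney.com">
    <design-works:lighting>Shadow in foreground, light in back</design-works:lighting>
    <disney:lighting>Soft glow around the character</disney:lighting>
</video-game>"""

# Hypothetical per-namespace handlers (namespace-dependent processing).
HANDLERS = {
    "http://www.design-works.com": lambda text: "design-works renderer: " + text,
    "http://www.disney.com": lambda text: "disney renderer: " + text,
}


def split_tag(tag):
    """Split ElementTree's "{uri}local" form into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag


def process_lighting(xml_text):
    """Dispatch each lighting element to the handler for its namespace."""
    root = ET.fromstring(xml_text)
    results = []
    for child in root:
        uri, local = split_tag(child.tag)
        if local == "lighting":
            # Same local name, different namespace URI: the URI decides.
            results.append(HANDLERS[uri](child.text))
    return results


for line in process_lighting(GAME_DOC):
    print(line)
```

Because the instance document exposes the namespaces, the application can make this decision from the document alone; with unqualified elements it would have nothing to dispatch on.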