> It has nothing to do with schemas or schema languages. There is no data
> model for namespaces other than "set of names". XSLT uses them as a way
> of having literal elements. WSDL uses them as a way of defining
> WSDL-specific extensions. In theory, XHTML could use them as a way of
> adding new renderable elements. In each circumstance you would combine
> the schemas in a different way.
>
> This wide variety of usage patterns came about because we were told that
> "namespaces don't mean anything, they just name things." But then XML
> Schema came along, and it did treat them as if they were somewhat
> meaningful, because each schema had a targetNamespace. That's okay,
> because you can combine schemas so it is primarily just a file layout
> issue. Now RDDL comes along and tries to make it easy to find "the"
> schema for a namespace. Or "the" XSLT for a namespace. But there is no
> "right" schema for a namespace. The right one depends on the document
> type (i.e. what the document means, all together) and what the recipient
> is trying to do with the document.
>
>
> So of RDDL's three goals, #3 doesn't seem practically achievable until
> and unless we define a data model for namespaces and define the
> semantics of namespace combination. If we do not do that, then HTML is
> probably a sufficient referent.

I totally agree with you. But what interests me here is your point
that XML Schema treats namespaces as if they were meaningful. That's
what I answered to Simon St.Laurent in a previous mail (I can't find it
in the archives; it seems that the archives have lost a few posts...).
Simon was against defining document types, and wanted to use namespaces
differently so that they could be assimilated to document types.

We are now at a point where we have to choose between two scenarios:

1) Namespaces are just sets of names with no additional semantics; they
are 50% of QNames and that's all.
In this scenario, schemas can span multiple namespaces because the
semantics of a document are defined by its schema, not by the
namespaces it uses.

WARNING: as in all of my previous posts, 'schema' means a set of
constraints on the structure of a document. A schema can be written as
a DTD, an XML Schema, a RELAX NG schema, etc.

In this world, we have to define the concept of 'document type' so that
we can associate meta-data (schemas, stylesheets, code, etc.) with a set
of documents that share the same semantics (the same abstract schema).
In this world, associating schemas with namespaces is nonsense, RDDL is
still useful as a namespace description format, and XML Schema is
severely impaired (because you can't write the XML Schema of a document
which mixes namespaces, like RDDL, WAP 2.0, a SOAP request, etc.).

2) Namespaces are meaningful. They are not only 50% of the document.
Some programs can implement algorithms based on the namespaces only,
and still produce interesting results.

For example, a browser, when encountering an unknown namespace, could
delegate the rendering of the subtree beginning at the foreign element
to a plugin that could be dynamically downloaded. This is possible
because an abstract schema is bound to the namespace, so that it is
possible to write code that depends only on the namespace and its
intrinsic schema. A renderer plugin could process the subtree because
it would know the elements and the structure of elements that the
namespace schema enforces. The enclosing schema would not have any
means of changing this schema, so the plugin would never find
unexpected data.

[ This would not be possible in scenario 1, where the document
structure is totally unrelated to the namespaces, so it would be
impossible to associate some code with a namespace and expect any
namespace-inherent structure.
There could be situations in which some elements would be recognized,
but found in a totally different context than the plugin could expect,
thus causing failures or, worse, incorrect results. ]

Note that this model could easily be extended to support multiple
schemas, depending on the first element encountered. In that case, a
namespace would have one abstract schema per element that can stand
alone or be embedded into a foreign document.

In this scenario, XML Schema would be at ease, not limited by its
unique targetNamespace. Indeed, schemas would never span multiple
namespaces, to ensure the proper operation of all the code that depends
on namespaces. However, schemas could be extensible and integrate
multiple namespaces by delegating the validation of foreign namespaces
to their appropriate schema (based on the element encountered). We
would have to find a way to standardize this delegation mechanism.
Moreover, as we cannot expect all schemas for all namespaces to be
written in XML Schema, this delegation mechanism should support
cross-schema-language delegation. This would result in the
"namespace-insulation" of schemas: each schema is assigned a single
targetNamespace, yet has interfaces with other namespaces through
delegation mechanisms.

There is another example of usage of namespaces, maybe more interesting
than rendering plugins given the current news about web services.
Namespace-based content validation, dispatching and processing would
radically change the way web services could be implemented. We could
have general validation for any SOAP request, for example, each part of
the SOAP message being validated by an appropriate schema: the SOAP
envelope by the SOAP schema, the request by the schema associated with
the namespace and name of its root element, etc.
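
To make the dispatching idea above a bit more concrete, here is a
minimal Python sketch of it. It is only an illustration under stated
assumptions, not a proposal for the actual delegation mechanism: the
Dispatcher class, the check() helper, the two hand-written validator
functions (stand-ins for real schemas in any schema language) and the
http://example.com/orders namespace with its order/item elements are
all hypothetical. The SOAP URI is the SOAP 1.1 envelope namespace.

```python
import xml.etree.ElementTree as ET

SOAP = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
ORDERS = "http://example.com/orders"                # hypothetical application namespace


def split_qname(tag):
    """Split ElementTree's '{uri}local' tag into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag


class Dispatcher:
    """Hypothetical registry: (namespace URI, root element name) -> validator."""

    def __init__(self):
        self.validators = {}

    def register(self, ns, local, validator):
        self.validators[(ns, local)] = validator

    def validate(self, elem, parent_ns=None, errors=None):
        """Walk the tree; at each namespace boundary, delegate to the
        validator registered for that namespace and element name."""
        if errors is None:
            errors = []
        ns, local = split_qname(elem.tag)
        if ns != parent_ns:  # the root, or a subtree in a foreign namespace
            validator = self.validators.get((ns, local))
            if validator is None:
                errors.append(f"no schema registered for {{{ns}}}{local}")
            else:
                errors.extend(validator(elem))
        for child in elem:
            self.validate(child, ns, errors)
        return errors


# Stand-in "schemas": each one only looks at elements of its own namespace.
def validate_envelope(elem):
    body = elem.find(f"{{{SOAP}}}Body")
    return [] if body is not None else ["Envelope has no Body"]


def validate_order(elem):
    item = elem.find(f"{{{ORDERS}}}item")
    return [] if item is not None else ["order has no item"]


dispatcher = Dispatcher()
dispatcher.register(SOAP, "Envelope", validate_envelope)
dispatcher.register(ORDERS, "order", validate_order)


def check(xml_text):
    """Parse a document and return the list of validation errors."""
    return dispatcher.validate(ET.fromstring(xml_text))


GOOD = (
    f'<s:Envelope xmlns:s="{SOAP}"><s:Body>'
    f'<o:order xmlns:o="{ORDERS}"><o:item>book</o:item></o:order>'
    f'</s:Body></s:Envelope>'
)
print(check(GOOD))  # prints []
```

The envelope validator never looks inside the order, and the order
validator never sees the envelope: each subtree is handed to whatever
is registered for its own namespace and root element name, which is the
"namespace-insulation" property in miniature.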

Of course, in this scenario, typing information would be closely
related to namespaces, so RDDL would be the perfect place to list all
the schemas of a namespace, not forgetting to specify the root elements
of each schema.

Note that this delegation mechanism and the "namespace-insulation"
property of scenario 2 have an equivalent in scenario 1. It would be
implemented by having the schemas reference other document types
defined in other schemas. Likewise, for better integration, we would
have to design a delegation mechanism that would allow schemas in one
language to ask for the validation of foreign document types with
schemas in other languages.

Now let's try to compare the two scenarios.

- Scenarios 1 and 2 seem equally powerful.

- They both require some work on the schema validation process, to
  design this cross-schema-language validation mechanism.

- Scenario 2 has the advantage of not requiring us to throw away the
  work done on XML Schema. Though XML Schema is not the only schema
  language available, far from it, a lot of technologies have
  standardized on top of it. Would it be wise to throw it away, as
  scenario 1 seems to require?

- Scenario 2 is also the simplest way to handle the nasty habit of
  some specs of adding supposedly "out-of-band" special attributes to
  documents. XML Schema, for example, adds an xsi:schemaLocation
  attribute to the root element of each document. In scenario 1, we
  could either ignore such an additional attribute, leaving a hole open
  in the schema that allows any kind of foreign attribute to be added to
  the document without breaking its validity, or we could specify this
  attribute in the abstract schema of the corresponding document type,
  which would not be satisfying either (it would not scale to other
  specifications like XLink). In scenario 2, more than root elements,
  we could define root patterns as keys for schemas.
  Instead of saying "for each foo:bar, the schema must be so and so",
  we could say "for each element that has an xlink:href attribute, the
  schema must be so and so". This way, we could validate 'squatter'
  attributes without touching the original document schema.

- On the other hand, scenario 2 will break some current schemas that
  span multiple namespaces. DTDs like the DTD for RDDL, WAP 2.0 et al.
  would not be accepted as proper schemas since they don't respect the
  namespace insulation principle. With luck, we could rewrite those
  schemas to respect it. If we cannot, then we're stuck: the schema is
  broken and can't be properly used in the new XML world created by
  choosing scenario 2. But what was broken can be rebuilt
  differently...

- Finally, scenario 1 requires some more work on the concept of
  document type, whereas scenario 2 reuses and leverages the concept of
  namespace. Defining the concept of document type is not necessarily
  difficult; in fact I suspect it is a set of schemas, whereas a
  namespace is a set of names, so we could have document type URIs, and
  so on. But this would be a new object to define, on top of
  namespaces.

So, which scenario should we choose? It is quite a surprise to me that
after battling so fiercely against the 'namespace == document type'
belief I see so much advantage to it...

Let me say things clearly and try to sum them up. In the current state
of XML specifications and standards:

1) namespace != document type, except maybe for XML Schema, which has a
different belief.

2) RDDL cannot be used to obtain schemas for a given XML document, so
we have to create a document type.

3) An alternative to the document type creation is to play a what-if
game about 'namespace == document type'. Scenario 1 is 'namespace !=
document type, so let's create document types'. Scenario 2 is
'namespace == document type, so what is XML becoming?'.

4) Scenarios 1 and 2 are equally powerful, i.e. what can be done in
scenario 1 can be done in scenario 2. This is what I feel; I don't have
any proof. Ideas and protests are welcome here.

5) However, the price to pay for scenario 1 and scenario 2 seems
different. Scenario 1 will save some schemas that could not be
rewritten within the namespace insulation constraints of scenario 2,
but adds a new concept and prohibits the generalised use of XML Schema.
Scenario 2 reuses the concept of namespaces, allows the use of XML
Schema and could handle "parasite attributes" quite nicely, but will
force a massive rewrite of all schemas that contain a mix of
namespaces, as well as force everybody to think twice about their usage
of namespaces (which would no longer be only 50% of a QName).

Ideas and remarks are welcome. Do you at least agree that we'll need to
choose between those two scenarios and standardize on this choice,
instead of having some key technologies assume scenario 2 (XML Schema,
RDDL) and others assume scenario 1 (XML Namespaces, XHTML
Modularisation, etc.)?

Best regards,
Nicolas Lehuen
http://nicolas.lehuen.com/