[Fwd: ATTN: Please comment on XHTML (before it's too late)]
This is from Eliot Kimber (<eliot@i...>):

Oren Ben-Kiki wrote:

> > It's no more of a "threat" than the "threat" of people creating their own namespaces for any purpose, which is indeed the entire idea behind namespaces.
>
> That depends on what you feel "the entire idea of namespaces" to be. To me, the main idea is to allow applications to distinguish between tags with different semantics. By qualifying a tag with a namespace, the document writer essentially informs the application that the semantics of the tag is that associated with the namespace. The fact that this semantics is defined outside the XML standards is besides the point.

No--namespaces have *absolutely nothing* to do with semantics. The only possible purpose of namespaces is to disambiguate names. *ANY ASSUMPTIONS MADE ABOUT SEMANTICS BASED ONLY ON NAME ARE UNJUSTIFIED AND UNSUPPORTABLE*.

For example, I may have many names in different name spaces that map to the same semantic--there is no way to describe this using name spaces alone. A given name may have different semantics based on its context (either its structural context or its use context). Because the name space mechanism provides neither a formal binding between name space names and semantic definitions nor a binding between name space names and vocabulary definitions, it is impossible to make *ANY* reliable inferences about the meanings of names in an XML document.

That is, you either know what the semantics of a particular name are or you do not. The binding is always to the entire name, not just to the namespace prefix. Making assumptions based on the prefix alone is at best a guess.

First, unless you personally defined the name space, you have no way of knowing if a given name is in fact a "valid" name in the name space because there is no definition of how one defines the set of names in a name space.
If I find the name "myspace:foo" there is no standardized way to validate that "foo" is a member of the name space "myspace" because there is no standardized definition mechanism for the vocabulary of which "foo" may or may not be a member. Thus, all the name-space prefix is doing is ensuring that "myspace:foo" will not collide with any other name that ends in "foo". AND THAT IS ALL.

Namespaces were intended to solve the problem of *name collision*, which they do. But they explicitly do not have anything to do with binding names to semantics and therefore you are *never* justified in inferring semantics from namespace use. You may know that a given name space *has been bound* to a given set of semantics, but that's different. This knowledge of the binding comes from some mechanism outside the namespace mechanism itself [see below].

By this reasoning, it doesn't matter how many different name-space prefixes XHTML uses because *none of them* give you a way to know that what you are processing is in fact an XHTML document (or XHTML-specific element). Rather, the binding between documents and their *governing semantic definitions* (e.g., schemas, architectures, etc.) must be provided by some other mechanism. In the absence of a generalized mechanism for doing this binding, it can only be done in documentation of the semantics.

Another thing to keep in mind is that there is not necessarily a one-to-one binding of schemas to name spaces (or name spaces to schemas). The same abstract types could be mapped to many different names (short and long, English and French, domain A and domain B, enterprise 1 and enterprise 2, etc.). The same names could be mapped to *different* semantics in different contexts (the element type "myns:employee" maps to both "person" in schema A and "bold" in schema B).

Assumptions about name-to-semantic bindings seem to be based on the idea that there is always an exact one-to-one binding of names to semantics.
But of course this is not always (or even usually) the case. For example, I would probably want to provide different name lengths or national language bindings for the same abstract element types. Thus, I would have one overall abstract schema, "MyElementTypes", and several name spaces that provide specialized names for elements derived from the abstract types.

Thus, while name spaces are mildly useful for disambiguating names, they can do nothing, by themselves, to provide a reliable or complete binding of names to semantics and therefore provide no basis for inferring semantics based on name space alone.

> So according to this idea, applications are built under the assumption that 'my:foo' and 'your:foo' are completely different, with nothing whatsoever in common. The fact they both have the name 'foo' is considered accidental. _That's_ the whole idea.

But this assumption is completely unfounded--"my:foo" and "your:foo" could in fact be mapped to exactly the same semantic--there is no way to know from the namespace usage itself and nothing in the namespace spec justifies the single mapping assumption.

> Providing three different namespaces which have the same semantics would force application writers to abandon this assumption. In XHTML, 'traditional:p', 'strict:p' and 'frameset:p' are the same thing. This would seriously mess XHTML applications up - put another way, it would cause generic XML applications to fail on XHTML documents.

Why would three name spaces cause more failures than one name space? Either you know what the names mean or you don't. In this case, all I have to do is know that the names in all three spaces map to the same base type. Since there's no W3C-defined mechanism for this, the authors of the XHTML spec can define an obvious one: use the base name as the base type.
Once I've implemented this mapping in my code, there's no problem (unless someone uses a bogus base type name, which of course there's no way to formally validate in the namespace universe). That is, my code looks like this:

    def process_element(node):
        # get_ns_name, get_base_part and the apply_* functions are
        # application helpers assumed to be defined elsewhere.
        nsp = get_ns_name(node.TagName)    # namespace name
        gi = get_base_part(node.TagName)   # base name (generic identifier)
        if nsp in ("XHTML strict ns URN",
                   "XHTML traditional ns URN",
                   "XHTML frameset ns URN"):
            apply_XHTML_semantics(node)
        elif nsp == "Some other namespace":
            apply_someother_semantics(node)
        else:
            raise UnknownNamespaceException(node)

I don't see where the problem is, unless the concern is the amount of typing one has to do.

[But what it looks like to me is that they really have three different *DTDs* (or rather, architectures) for the same base names. If this is in fact the case, then the XHTML authors have inappropriately confused name spaces with DTDs and they should fix that. In fact, I think there are four architectures at work here: the base architecture that defines the types and imposes the minimal structural rules, then three derived architectures, one each for "strict", "traditional", and "frameset", which impose different detailed structural rules on documents. There is no way, using W3C-defined mechanisms alone, to define this system today (you can do it with SGML Architectures). This may change when the XML Schema work is finished if (and only if) it satisfies the same requirements for type classification and constraint that SGML Architectures satisfy (ideally it would satisfy more than what SGML Architectures satisfy, but I'll settle for just having the equivalent of architectures).]

> For example, consider that a generic XML application must never mix up a 'commercial:order' with an 'administrative:order', no matter what.

You say "must": do you mean that in the absolute "law of nature" sense or in the "for this example, this is the business rule that applies" sense? If the former, then the use of "must" is entirely unfounded.
Maybe "commercial:order" and "administrative:order" are in fact specializations of a more general type "order" and there are processing contexts in which they *must* be processed in exactly the same way. Without knowing the semantics of all three types, there's no way to know what the business rules are, but in any case, the business rules cannot be inferred from the use or non-use of name spaces.

> On the other hand, one would expect that a 'strict:p' element would be interchangable with a 'traditional:p' element. For example, in an XHTML editor, I'd expect to be able to cut one and paste it in replacement of another. That seems like a messy issue, unless I'm missing something.

Whether it is meaningful or not to replace one element type with another can only be defined at the schema or application level. The use or non-use of name spaces cannot tell you that. The mess is no different from knowing whether or not "p" and "pre" are interchangeable. A name is just an identifier and in the absence of a formal, verifiable binding of names to semantics you cannot make any inferences from the names. The fact that we have a body of knowledge about what we think "p" means is a red herring.

The only way to know if "strict:p" is interchangeable with "traditional:p" is to read the XHTML docs, because that's the only place the semantics could possibly be defined because that's the only defined mechanism we have at the moment. So there's no mess because you *always* have to read the docs.

Now, the docs can say "there is a binding between the names in namespace X and the semantic types defined in this document"--that's fine, but that is not a computer processible statement--it's a directive to programmers and document authors.
But since you can't know about this statement unless you've read the docs first, if you see the namespace first and make assumptions about semantic bindings, you are living dangerously at best and may make wildly incorrect or inappropriate inferences at worst.

The first thing you must do when you see a new name space is chase down *all semantic documentation* that references that name space to see what the possible semantics are. Of course, this is impossible in the general case *BECAUSE THERE'S NO BINDING FROM NAME SPACE NAMES TO SEMANTIC DEFINITIONS*. Oops. That is, unless you *already know* what semantics are bound to a given name space, you cannot find it out reliably.

Here's an experiment: find *the complete list* of semantic bindings for these name spaces:

    xmlns = "urn:schemas-microsoft-com:xml-data"
    xmlns:dt = "urn:schemas-microsoft-com:datatypes"
    xmlns:xa = "www.extensibility.com/schemas/xdr/metaprops.xdr"

I want documents, formal, machine-readable specifications, etc. such that there can be no argument about what the set of valid bindings is. I believe it is impossible to do.

Either the name space declaration must also bind to one or more semantic definitions, or the document must bind to semantic definitions and then bind those definitions to name spaces.

With the SGML Architecture mechanism defined in ISO/IEC 10744:1997, you have the first: an architecture use declaration binds to both a semantic definition (the architecture documentation) and a name-space definition (the architectural DTD, which serves to define a vocabulary of element types and attribute names). Local names are bound to architectural names as part of the element type definition. The same local name can be bound to any number of architectures and multiple local names can be bound to a single architectural name.
Name collisions from different architectures are handled by using different local names (that is, given two architectures that both define the element type "p", I might use the local names "p1" and "p2", each mapped to the appropriate architectural "p").

I mention architectures merely as an example of an existing, standardized mechanism for solving the name-to-semantic binding problem. I would expect the eventual XML Schema mechanism to provide the same sort of mechanism, one that is at least as complete as the SGML Architecture mechanism and, hopefully, more complete and convenient to use (the SGML Architecture mechanism is limited by the fact that we had to do everything within the constraints of DTD syntax).

> > If three namespaces present such an insurmountable problem, perhaps again, the current "implementation" of namespaces is at fault.
>
> The problem is not with the namespaces implementation (or definition, or design). It is with using them to a different purpose then they were designed for.

Namespaces were designed for exactly one purpose: to lexically disambiguate names within the global name space of URN-identified things. They do that. They do nothing else. Therefore, the XHTML use of name spaces, whatever it is, must be correct.

NOTE: I have no opinion on XHTML's use or non-use of multiple name spaces. It is entirely irrelevant to the usability or processibility of XHTML documents.

Fundamentally, there is no difference between "strict:p" (or rather "urn:xmlnamespace:XHTML:strict:p") and "urn:xmlnamespace:XHTML:strictp". They are both unique names from which you can infer exactly the same amount of semantic information, which is to say, none. In both cases, I have to know, as an author and programmer, what the element type means and the *only way* to know that is to read the XHTML spec. Once I've read the spec, the names used in documents are irrelevant as long as the mapping is implemented correctly.
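
A minimal sketch of that last point, not part of the original message: the binding from names to semantics lives in a table that a programmer fills in after reading the documentation, and the names themselves carry no meaning. The namespace names, the semantic type labels, and the helper functions below are all hypothetical and exist only for illustration.

```python
# Hypothetical namespace names, invented for this illustration.
XHTML_STRICT = "urn:example:xhtml:strict"
XHTML_TRADITIONAL = "urn:example:xhtml:traditional"
OTHER_NS = "urn:example:other"

# The binding is written down by a programmer who has read the docs.
# It is keyed on the entire name (namespace name, base name), never
# on the prefix or the base name alone.
SEMANTIC_BINDINGS = {
    (XHTML_STRICT, "p"): "paragraph",
    (XHTML_TRADITIONAL, "p"): "paragraph",  # different name, same semantic
    (OTHER_NS, "p"): "price",               # same base name, different semantic
}


def split_qualified_name(name, prefix_map):
    """Split 'prefix:base' and resolve the prefix to a namespace name."""
    prefix, _, base = name.partition(":")
    return prefix_map.get(prefix), base


def semantic_type(name, prefix_map):
    """Return the documented semantic type, or None if no binding is known."""
    return SEMANTIC_BINDINGS.get(split_qualified_name(name, prefix_map))


prefixes = {
    "strict": XHTML_STRICT,
    "traditional": XHTML_TRADITIONAL,
    "other": OTHER_NS,
}
print(semantic_type("strict:p", prefixes))       # paragraph
print(semantic_type("traditional:p", prefixes))  # paragraph
print(semantic_type("other:p", prefixes))        # price
print(semantic_type("unknown:p", prefixes))      # None
```

Nothing in the lookup inspects what a name "looks like": two distinct names resolve to the same type, one base name resolves to two types, and an undocumented name resolves to nothing.
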
At best, the use of name spaces can provide a convenient memory jog for remembering what the mapping is.

Cheers,

Eliot

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@i... the following message; (un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message; subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)