XML-DEV Mailing List Archive

Dealing with namespaces (Was Re: "Binary XML" proposals)
Ken MacLeod wrote:
> Joe English <jenglish@f...> writes:
> > Al Snell wrote:
> > > [using a string table for element and attribute names]
> >
> > That's the approach I used in Cost; it works well. [...]
> >
> > This starts to break down when you throw namespaces into the mix
> > though, since element and attribute names are no longer simple,
> > atomic values. [...]
> >
> > I haven't yet seen or thought up a fully satisfactory solution to
> > this problem...
>
> In Orchard[1], I use the tuple of (URI,LocalName) for element and
> attribute names, instead of the QName, and it works great.

Another approach I've been playing with is to normalize all QNames
on input so that the same prefix is always used for the same URI.
(If two prefixes are bound to the same URI, only the first one is
used internally; if a declaration uses a prefix that's previously
been bound to a different URI, the normalizer generates a new prefix.)

Application programs can also declare prefix mappings. This way,
a program that wants to process elements in the {http://www.foo.com}
namespace can call

    xmlns::declare "foo" "http://www.foo.com"

at the beginning of the program. Subsequently, QNames with prefixes
bound to the {http://www.foo.com} namespace URI will be rewritten to
use the 'foo:' prefix no matter what prefix was used in the input
document. Then the application can treat element and attribute names
atomically as "foo:bar" just like in the pre-namespace days.

This works OK for many (most?) applications, but troubles arise
with architectures that use QNames inside attribute values and
element content, for example XSD and anything that uses XPath.
To support architectures like this, it's also necessary to make the
QName normalization routine available to the application. In a
SAX-like interface it's sufficient to provide access to the "current"
namespace environment; in a DOM-like interface, QName normalization
also depends on a context node.

Then there's the matter of (re)serialization.
If a program reads in a document, modifies it some, then writes it
back out again, is it necessary to preserve the original prefixes
or is it safe to use the "normalized" ones? In the general case I
think it's necessary to preserve the original prefixes -- the
architecture might use QNames-in-content in places that the
application doesn't know about, or it may just be more trouble than
it's worth for the app to normalize all the QNames-in-content.
Preserving the namespace environment is subtle and difficult to get
right; the DOM and XSLT specs have to deal with this issue, and the
solution doesn't look pretty.

Then there's RDF, which interprets unprefixed attribute names as if
they "belong to" the namespace of their parent element, whereas most
other architectures -- including the Infoset and "Namespaces in XML"
TR -- interpret unprefixed attributes as "global", i.e., they don't
have a Namespace URI property at all. If the architecture is mostly
Infoset-compliant but allows RDF data islands to be mixed in (which
seems like a common practice), how can you tell when to apply RDF
semantics and when not to? This one currently has me stumped. Maybe
for RDF it's best to translate all QNames into URIs and work with
those instead?

I'd like to have an API that makes namespace issues mostly
transparent. XSLT comes pretty close to achieving this. IMO the SAX
and DOM APIs miss the mark entirely; they don't help much at all.

In conclusion: namespaces are a pain in the ass.

--Joe English

  jenglish@f...
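
The prefix normalization described in the message could look roughly like the following. This is only a sketch in Python, not the code from Cost or any other tool named here: the class name `PrefixNormalizer`, its method names, the generated `ns1`, `ns2`, ... prefix scheme, and the `http://other.example/` URI are all assumptions made for illustration. It keeps one canonical prefix per namespace URI, lets the application pre-declare prefixes, generates a fresh prefix when a document reuses a prefix for a different URI, and leaves unprefixed attributes without a namespace.

```python
import itertools


class PrefixNormalizer:
    """Rewrite QNames so each namespace URI always gets the same prefix.

    Hypothetical illustration; all names here are assumptions.
    """

    def __init__(self):
        self.prefix_for_uri = {}   # canonical prefix chosen for each URI
        self.uri_for_prefix = {}   # reverse map, used to detect clashes
        self.scopes = [{}]         # stack of in-scope document bindings
        self._counter = itertools.count(1)

    def declare(self, prefix, uri):
        """Application-level declaration (the role of xmlns::declare)."""
        if self.uri_for_prefix.get(prefix, uri) != uri:
            raise ValueError(f"prefix {prefix!r} is bound to another URI")
        if uri not in self.prefix_for_uri:   # first prefix for a URI wins
            self.prefix_for_uri[uri] = prefix
            self.uri_for_prefix[prefix] = uri

    def _canonical_prefix(self, uri, suggested):
        """Return the internal prefix for uri, allocating one if needed."""
        if uri in self.prefix_for_uri:
            return self.prefix_for_uri[uri]
        prefix = suggested
        # Generate a new prefix if none was suggested (default namespace)
        # or the suggested one already stands for a different URI.
        while not prefix or prefix in self.uri_for_prefix:
            prefix = f"ns{next(self._counter)}"
        self.prefix_for_uri[uri] = prefix
        self.uri_for_prefix[prefix] = uri
        return prefix

    def push_scope(self, declarations):
        """Enter an element; declarations maps document prefix -> URI,
        with '' standing for the default namespace."""
        scope = dict(self.scopes[-1])
        scope.update(declarations)
        self.scopes.append(scope)

    def pop_scope(self):
        """Leave the element entered by the matching push_scope."""
        self.scopes.pop()

    def normalize(self, qname, is_attribute=False):
        """Map a document QName to its normalized form."""
        prefix, sep, local = qname.rpartition(":")
        if not sep and is_attribute:
            return local               # unprefixed attribute: no namespace
        uri = self.scopes[-1].get(prefix)
        if not uri:
            if prefix:
                raise KeyError(f"undeclared prefix {prefix!r}")
            return local               # no default namespace in scope
        return f"{self._canonical_prefix(uri, prefix)}:{local}"


norm = PrefixNormalizer()
norm.declare("foo", "http://www.foo.com")

# Document element:
#   <x:bar xmlns:x="http://www.foo.com"
#          xmlns:foo="http://other.example/" foo:a="1" b="2"/>
norm.push_scope({"x": "http://www.foo.com", "foo": "http://other.example/"})
element = norm.normalize("x:bar")                     # rewritten to foo:bar
attr_a = norm.normalize("foo:a", is_attribute=True)   # gets a generated prefix
attr_b = norm.normalize("b", is_attribute=True)       # left unprefixed
norm.pop_scope()
```

Because `normalize` is an ordinary method that reads the current scope stack, an application can also call it on QNames found in attribute values or element content while the relevant element is still in scope, which is the SAX-like case mentioned above. The sketch does nothing to preserve the original prefixes for reserialization; that would need the per-element declarations to be kept alongside the normalized names.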