Re: Why you should avoid Notation Declarations (by Kohsuke
On Wed, Feb 23, 2005 at 07:53:48AM -0500, Elliotte Harold wrote:
> Henry S. Thompson wrote:
> >You have a place in your XML document type for (e.g. hex-encoded)
> >BLOBs to be included or referenced. Each BLOB in turn should be an
> >encoding of something using one of a delimited set of more-or-less
> >standard encodings. You want to allow or require document authors to
> >signal which encoding they're using.
>
> How do notations improve on, for example, <data mime:type="image/jpeg"
> href="somefile.jpg"/>? Normally I like additional layers of indirection,
> but the NOTATION indirection has never seemed all that useful. What am I
> missing?

You're not missing very much.

Worse, in the Web environment for which we tried to develop XML, it's not up to the client, nor to the document author, what formats are available for the representation of a given resource identified by URI.

In English :-) that means that if you try and fetch http://www.example.com/images/foo.gif the remote Web server is *perfectly* at liberty to send you a JFIF/JPEG file, a PNG file, or even a text document. It does this based on a combination of the Accept header given by the user agent (e.g. Web browser) and what formats the server has available. And yes, common Web servers in widespread use (e.g. Apache) do actually do this.

So a document author has no business *whatsoever* saying what format should be used for an image unless they know exactly how their document will be processed. Which seems to me very much against the spirit of XML.

There *is* a case in which NOTATION makes some sense -- it's when the content of an element is in an XML-compatible text-based format other than XML. You can use a CDATA section, but the character gamut must be those Unicode characters permitted to be used within an XML document (or a subset, of course). Hence, you can't just put an image there.
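To make the character restriction concrete, here is a minimal sketch in Python (my choice of language, not something from the thread). It assumes a hypothetical <data> element and uses only the first few bytes of a JFIF header as stand-in image data. Raw bytes placed in a CDATA section are rejected by the parser, while the same bytes written as base64 text parse and decode cleanly.

```python
import base64
import xml.etree.ElementTree as ET

# First bytes of a JPEG/JFIF stream (illustrative stand-in for an image).
# They include 0x00 and 0x10, which are not legal XML 1.0 characters.
jpeg_header = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10]) + b"JFIF\x00"


def parses(xml_text: str) -> bool:
    """Return True if xml_text is accepted by the XML parser."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Attempt 1: put the bytes, one character per byte, inside a CDATA section.
# CDATA only suppresses markup recognition; it does not widen the set of
# characters allowed in a document.
raw_doc = "<data><![CDATA[" + jpeg_header.decode("latin-1") + "]]></data>"
raw_ok = parses(raw_doc)

# Attempt 2: base64-encode the bytes so the element content is plain ASCII.
encoded_doc = "<data>" + base64.b64encode(jpeg_header).decode("ascii") + "</data>"
encoded_ok = parses(encoded_doc)

# Decoding the element text recovers the original bytes.
round_trip = base64.b64decode(ET.fromstring(encoded_doc).text)

print("raw bytes in CDATA accepted:", raw_ok)
print("base64 content accepted:", encoded_ok)
print("round trip matches:", round_trip == jpeg_header)
```

The element name and the sample bytes are assumptions made for illustration only. Any binary payload containing control characters would behave the same way.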
W3C Schema Notation goes further, and you can base64-encode the content of an element, to circumvent the character set issue and remove the need to use CDATA sections. In this circumstance, the notation is signalling to the XML processor (post-schema at least) that the content of the element is in a constrained format. Presumably it could unpack and test, and if (for example) the JPEG image was found to be non-conforming with the corresponding ISO document, one can imagine rejecting the document, although I don't know if any processors do that automatically beneath the Schema level, nor if Schema explicitly sanctions this, nor if I'd always want such behaviour!

Personally I'd be 100% happy about removing DTD-style notations, which are pretty much completely broken, and I don't have a strong opinion on Schema notations except to say that I think that since they are not aligned with DTD notations they could perhaps more usefully have been given a different name, e.g. EncodedContent, and should still be restricted to element or attribute content... external unparsed entities, and the equivalent with XInclude, have no business (in general) trying to pontificate about what file format is to be used when in general they won't be correct. A hint that's useful in a closed environment is no more than a hint, and is a higher-level convention.

Liam

--
Liam Quin, W3C XML Activity Lead, http://www.w3.org/People/Quin/
http://www.holoweb.net/~liam/






