Re: [Summary, VERSION #2] Media type (MIME) of XML in MSWord?

"xsl" referred to should of course be "xls" for Excel...

Rick Marshall wrote:

> Hi Roger
>
> This is messy in the real world and I applaud your attempt to make
> sense of it - but I despair of you being successful.
>
> Note that this is very system dependent and anyone trying to make
> sense of it either goes gray or loses their hair. I have made some
> more notes for you....
>
> Costello, Roger L. wrote:
>
>> Hi Folks,
>>
>> Many, many thanks for clearing up my mistakes. I have revised the
>> summary based upon your comments. Please let me know of any remaining
>> errors. /Roger
>>
>> *A Summary of XML and Media Types (MIME)*
>>
>> *What is XML’s MIME Type?*
>>
>> At this URL is a list of the 350 different MIME types:
>>
>> http://www.iana.org/assignments/media-types/
>>
>> In this list you will see two different MIME types for XML:
>>
>> *application/xml*
>>
>> *text/xml*
>>
>> The latter MIME type (*text/xml*) has been deprecated. Thus, the
>> official MIME type for XML is:
>>
>> *application/xml*
>>
>> Note the format for expressing MIME types – it contains two parts,
>> separated by a slash:
>>
>> *type*/*subtype*
>>
>> *What is the Purpose of a MIME Type?*
>>
>> Suppose a browser, Web server, or other application is presented with
>> a resource (document). The purpose of a MIME type is to give
>> information to the browser, Web server, or other application about
>> the format of the data contained within the resource.
>>
>> *Where Does a MIME Type Come From?*
>>
>> MIME types are metadata. A MIME type is not stored within a resource.
>> It is not stored as a property of a resource. Heuristics are used for
>> determining the MIME type of a resource. In other words, a system
>> “guesses” what the MIME type is.
>>
> In the case of browsers they are told what the mime type is by the
> http header.
> However in the case of one popular browser it chooses to
> ignore the mime type in the http header and instead use the extension
> of the document being retrieved. Which is really silly when the
> "document" is a cgi script.
>
> So eg if I have a cgi script that is going to return html the
> extension is unimportant (for some reason - maybe it's the default),
> but if I want to return an xsl document then the cgi GET must be to a
> script ending in ".cgi" - go figure.
>
> Generally in email and browsers the client application is told the
> mime type - it does not guess, but as I said one large company tends
> to ignore this. To be fair I think they have a security reason as
> well, and it does address the issue of what to do if the MIME type
> differs from the implied type from the extension.
>
> To see why this is so you have to understand some history. In Unix
> (and now Linux) the dot extension is arbitrary. The file name is the
> filename including the dot and by convention some dot extensions mean
> things - like .c, .o, etc. Most importantly executable files were
> indicated by an executable bit in the i-node. In early microcomputer
> os (file managers?) the dot was not stored and a simpler mechanism
> was used to determine what to do with a file. Executable binary files
> ended in exe, batch files in bat etc. The file name was a name (fred)
> plus a file type (the bit after the dot) (exe). The dot was
> confusingly added when displaying the name, but not by all early os
> (or even early dos for that matter).
>
> Now, here's the tricky bit - when MIME types came along they were
> necessary for accurate determination of file types in typeless systems
> like unix and the web, but superfluous to typed file systems like the
> microsoft products, and hence the conflict. I suspect MS can't fix
> this without a fundamental change to their systems and for unix/linux
> - well the system was designed to suit them so why should they change?
>
> So the importance of mime types is a function of your universe :)
>
>> On Windows the MIME type is guessed by using the file extension. In
>> the Windows Registry is a mapping from file extension to MIME type.
>>
>> Examples:
>>
>> - If a file ends with the extension .txt then the MIME type of the
>> file is guessed to be *text/plain*.
>>
>> - If a file ends with the extension .doc then the MIME type of the
>> file is guessed to be *application/msword*.
>>
>> - If a file ends with the extension .xml then the MIME type of the
>> file is guessed to be *application/xml*.
>>
>> - If a file ends with the extension .zip then the MIME type of the
>> file is guessed to be *application/zip*.
>>
>> *Can a MIME Type be Wrong?*
>>
>> Yes! You could create a *Word* document and deliberately give the
>> document a false extension, such as “.xml”. Windows will then guess
>> that the MIME type for the document is *application/xml*, which is
>> clearly incorrect.
>>
>> *Why is MIME Type Important?*
>>
>> It would appear that MIME types are redundant. After all a file
>> extension can tell you what kind of data a file contains, right?
>> True. However, when you send a file across the Web, the file loses its
>> file extension; only the contents are sent, not the filename (or file
>> extension). This is where MIME becomes important. When data is sent
>> across the Web, it is sent as the payload of an HTTP message. In the
>> HTTP header is a field called Content-type, and the value of this
>> field is a MIME type, /e.g.,/
>>
>> Content-type: *application/xml*
>>
>> Thus, when a Web server receives data it examines the Content-type
>> header field to determine the type of data that is in the payload.
>>
>> *What is an XML Document?*
>>
>> Take this simple XML:
>>
>> *<?xml version="1.0"?>*
>>
>> *<root>*
>>
>> * Blah*
>>
>> *</root>*
>>
>> and put it into *Word* and give the document the extension “.xml”.
>> As we’ve seen above the MIME type is:
>>
>> *application/xml*
>>
>> However, it is not an XML document. It is a *Word* document.
>>
>> Conversely, take the same XML and put it into *Notepad* and give the
>> document the extension “.txt”. As we’ve seen the MIME type is:
>>
>> *text/plain*
>>
>> Yet, it is an XML document.
>>
>> So, what is an XML document? Answer: an XML document is one that has
>> an XML declaration as the first thing in the file. If you open the
>> above *Word* document you won’t find an XML declaration as the first
>> thing. If you open the above *Notepad* document you will find an XML
>> declaration as the first thing.
>>
>> *Further Information*
>>
>> Elliotte Rusty Harold has written an excellent article on this subject:
>>
>> http://www-128.ibm.com/developerworks/xml/library/x-mxd2.html
>>
>> *Acknowledgements*
>>
>> I would like to gratefully acknowledge the excellent inputs from
>> these people:
>>
>> Mitch Amiano
>>
>> Rick Jelliffe
>>
>> Amelia Lewis
>>
>> Rick Marshall
>>
>> Dave Pawson
>>
>> Bryan Rasmussen
>>
>> Henri Sivonen
>>
>> Nathan Young
>>
>
>-----------------------------------------------------------------
>The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
>initiative of OASIS <http://www.oasis-open.org>
>
>The list archives are at http://lists.xml.org/archives/xml-dev/
>
>To subscribe or unsubscribe from this list use the subscription
>manager: <http://www.oasis-open.org/mlmanage/index.php>
>
begin:vcard
fn:Rick Marshall
n:Marshall;Rick
email;internet:rjm@z...
tel;cell:+61 411 287 530
x-mozilla-html:TRUE
version:2.1
end:vcard
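
The guesses discussed in the summary (MIME type from the file extension, MIME type from the HTTP Content-type header, and "is it XML?" from the first bytes of the file) can be sketched in Python. This is only a minimal illustration under stated assumptions: the function names and the small EXTENSION_TO_MIME table are hypothetical, the table merely stands in for a registry-style mapping and holds just the four extensions listed in the summary, and the declaration check is the heuristic the summary describes (the XML declaration is optional in XML 1.0, so genuine XML files can fail it).

```python
import os

# Hypothetical stand-in for a registry-style extension-to-MIME mapping.
# Only the four extensions named in the summary are included.
EXTENSION_TO_MIME = {
    ".txt": "text/plain",
    ".doc": "application/msword",
    ".xml": "application/xml",
    ".zip": "application/zip",
}


def guess_mime_from_name(filename):
    """Guess a MIME type purely from the file extension (can be wrong)."""
    _, ext = os.path.splitext(filename)
    # Fall back to the generic binary type when the extension is unknown.
    return EXTENSION_TO_MIME.get(ext.lower(), "application/octet-stream")


def media_type_from_header(content_type_value):
    """Extract 'type/subtype' from a Content-type header value,
    dropping any parameters such as charset."""
    return content_type_value.split(";", 1)[0].strip().lower()


def starts_with_xml_declaration(data):
    """Return True if the bytes begin with an XML declaration,
    allowing for an optional UTF-8 byte order mark."""
    if data.startswith(b"\xef\xbb\xbf"):
        data = data[3:]
    return data.startswith(b"<?xml")


if __name__ == "__main__":
    xml_text = b'<?xml version="1.0"?>\n<root>\n  Blah\n</root>\n'
    # Saved from Notepad as .txt: the extension says text/plain,
    # but the content starts with an XML declaration.
    print(guess_mime_from_name("sample.txt"), starts_with_xml_declaration(xml_text))
    # What a receiver would read from the HTTP header instead.
    print(media_type_from_header("application/xml; charset=utf-8"))
```

The point of keeping the three functions separate is that they can disagree, which is exactly the situation the thread is about: the extension, the header, and the content are three independent sources of information about the same resource.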