Re: [Summary, VERSION #2] Media type (MIME) of XML in MSWord?

"xsl" referred to should of course be "xls" for excel...

Rick Marshall wrote:

> Hi Roger
>
> This is messy in the real world and I applaud your attempt to make
> sense of it - but I despair of you being successful.
>
> Note that this is very system dependent and anyone trying to make
> sense of it either goes gray or loses their hair. I have made some
> more notes for you....
>
> Costello, Roger L. wrote:
>
>> Hi Folks,
>>
>> Many, many thanks for clearing up my mistakes. I have revised the
>> summary based upon your comments. Please let me know of any remaining
>> errors. /Roger
>>
>> *A Summary of XML and Media Types (MIME)*
>>
>> *What is XML’s MIME Type?*
>>
>> At this URL is a list of the 350 different MIME types:
>>
>> http://www.iana.org/assignments/media-types/
>>
>> In this list you will see two different MIME types for XML:
>>
>> *application/xml*
>>
>> *text/xml*
>>
>> The latter MIME type (*text/xml*) has been deprecated. Thus, the
>> official MIME type for XML is:
>>
>> *application/xml*
>>
>> Note the format for expressing MIME types – it contains two parts,
>> separated by a slash:
>>
>> *type*/*subtype*
>>
>> *What is the Purpose of a MIME Type?*
>>
>> Suppose a browser, Web server, or other application is presented with
>> a resource (document). The purpose of a MIME type is to give
>> information to the browser, Web server, or other application about
>> the format of the data contained within the resource.
>>
>> *Where Does a MIME Type Come From?*
>>
>> MIME types are metadata. A MIME type is not stored within a resource.
>> It is not stored as a property of a resource. Heuristics are used for
>> determining the MIME type of a resource. In other words, a system
>> “guesses” what the MIME type is.
>>
> In the case of browsers they are told what the mime type is by the
> http header.
> However in the case of one popular browser it chooses to
> ignore the mime type in the http header and instead use the extension
> of the document being retrieved. Which is really silly when the
> "document" is a cgi script.
>
> So eg if I have a cgi script that is going to return html the
> extension is unimportant (for some reason - maybe it's the default),
> but if I want to return an xsl document then the cgi GET must be to a
> script ending in ".cgi" - go figure.
>
> Generally in email and browsers the client application is told the
> mime type - it does not guess, but as I said one large company tends
> to ignore this. To be fair I think they have a security reason as
> well, and it does address the issue of what to do if the MIME type
> differs from the implied type from the extension.
>
> To see why this is so you have to understand some history. In Unix
> (and now Linux) the dot extension is arbitrary. The file name is the
> filename including the dot and by convention some dot extensions mean
> things - like .c, .o, etc. Most importantly executable files were
> indicated by an executable bit in the i-node. In early microcomputer
> os (file managers?) the dot was not stored and a simpler mechanism
> was used to determine what to do with a file. Executable binary files
> ended in exe, batch files in bat etc. The file name was a name (fred)
> plus a file type (the bit after the dot) (exe). The dot was
> confusingly added when displaying the name, but not by all early os
> (or even early dos for that matter).
>
> Now, here's the tricky bit - when MIME types came along they were
> necessary for accurate determination of file types in typeless systems
> like unix and the web, but superfluous to typed file systems like the
> microsoft products, and hence the conflict. I suspect MS can't fix
> this without a fundamental change to their systems and for unix/linux
> - well the system was designed to suit them so why should they change?
>
> So the importance of mime types is a function of your universe :)
>
>> On Windows the MIME type is guessed by using the file extension. In
>> the Windows Registry is a mapping from file extension to MIME type.
>>
>> Examples:
>>
>> - If a file ends with the extension .txt then the MIME type of the
>> file is guessed to be *text/plain*.
>>
>> - If a file ends with the extension .doc then the MIME type of the
>> file is guessed to be *application/msword*.
>>
>> - If a file ends with the extension .xml then the MIME type of the
>> file is guessed to be *application/xml*.
>>
>> - If a file ends with the extension .zip then the MIME type of the
>> file is guessed to be *application/zip*.
>>
>> *Can a MIME Type be Wrong?*
>>
>> Yes! You could create a *Word* document and deliberately give the
>> document a false extension, such as “.xml”. Windows will then guess
>> that the MIME type for the document is *application/xml*, which is
>> clearly incorrect.
>>
>> *Why is MIME Type Important?*
>>
>> It would appear that MIME types are redundant. After all, a file
>> extension can tell you what kind of data a file contains, right?
>> True. However, when you send a file across the Web, the file loses
>> its file extension: only the contents are sent, not the filename (or
>> file extension). This is where MIME becomes important. When data is
>> sent across the Web, it is sent as the payload of an HTTP message. In
>> the HTTP header is a field called Content-type, and the value of this
>> field is a MIME type, e.g.,
>>
>> Content-type: *application/xml*
>>
>> Thus, when a Web server receives data it examines the Content-type
>> header field to determine the type of data that is in the payload.
>>
>> *What is an XML Document?*
>>
>> Take this simple XML:
>>
>> <?xml version="1.0"?>
>> <root>
>>     Blah
>> </root>
>>
>> and put it into *Word* and give the document the extension “.xml”.
>> As we’ve seen above the MIME type is:
>>
>> *application/xml*
>>
>> However, it is not an XML document. It is a *Word* document.
>>
>> Conversely, take the same XML and put it into *Notepad* and give the
>> document the extension “.txt”. As we’ve seen the MIME type is:
>>
>> *text/plain*
>>
>> Yet, it is an XML document.
>>
>> So, what is an XML document? Answer: an XML document is one that has
>> an XML declaration as the first thing in the file. If you open the
>> above *Word* document you won’t find an XML declaration as the first
>> thing. If you open the above *Notepad* document you will find an XML
>> declaration as the first thing.
>>
>> *Further Information*
>>
>> Elliotte Rusty Harold has written an excellent article on this subject:
>>
>> http://www-128.ibm.com/developerworks/xml/library/x-mxd2.html
>>
>> *Acknowledgements*
>>
>> I would like to gratefully acknowledge the excellent inputs from
>> these people:
>>
>> Mitch Amiano
>> Rick Jelliffe
>> Amelia Lewis
>> Rick Marshall
>> Dave Pawson
>> Bryan Rasmussen
>> Henri Sivonen
>> Nathan Young
>>
>
>-----------------------------------------------------------------
>The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
>initiative of OASIS <http://www.oasis-open.org>
>
>The list archives are at http://lists.xml.org/archives/xml-dev/
>
>To subscribe or unsubscribe from this list use the subscription
>manager: <http://www.oasis-open.org/mlmanage/index.php>

begin:vcard
fn:Rick Marshall
n:Marshall;Rick
email;internet:rjm@z...
tel;cell:+61 411 287 530
x-mozilla-html:TRUE
version:2.1
end:vcard
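
The three ideas discussed in this message (guessing a type from the extension, being told the type by a Content-type header, and looking for an XML declaration in the content itself) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Windows or any browser actually works: the table `EXTENSION_TO_MIME` is a hand-written stand-in for the Registry mapping and holds only the four extensions from the summary, and the function names and file names are made up for the example. The declaration check follows the summary's test literally. XML 1.0 makes the declaration optional, so the check would reject well-formed documents that omit it, and it only handles ASCII-compatible encodings.

```python
import os
from email.message import Message

# Hand-written stand-in for an extension-to-type registry (assumption:
# only the four examples from the summary are included).
EXTENSION_TO_MIME = {
    ".txt": "text/plain",
    ".doc": "application/msword",
    ".xml": "application/xml",
    ".zip": "application/zip",
}

UTF8_BOM = b"\xef\xbb\xbf"


def guess_mime_type(filename: str) -> str:
    """Guess a MIME type from the file extension alone."""
    ext = os.path.splitext(filename)[1].lower()
    return EXTENSION_TO_MIME.get(ext, "application/octet-stream")


def declared_type(content_type_header: str) -> str:
    """Return the bare type/subtype from a Content-type header value."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    # get_content_type() lowercases the value and drops parameters
    # such as charset.
    return msg.get_content_type()


def starts_with_xml_declaration(data: bytes) -> bool:
    """True if the bytes begin with '<?xml' followed by whitespace,
    allowing for an optional UTF-8 byte order mark."""
    if data.startswith(UTF8_BOM):
        data = data[len(UTF8_BOM):]
    return data[:5] == b"<?xml" and data[5:6] in (b" ", b"\t", b"\r", b"\n")


xml_text = b'<?xml version="1.0"?>\n<root>\n  Blah\n</root>\n'
# First bytes of a legacy binary Word (.doc) file, used here as a
# placeholder for "the same text saved from Word".
word_bytes = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1" + b"\x00" * 16

for name, content in [("notes.txt", xml_text), ("notes.xml", word_bytes)]:
    print(name, guess_mime_type(name), starts_with_xml_declaration(content))

print(declared_type("application/xml; charset=utf-8"))
```

With these inputs the extension-based guess and the content check disagree in both cases: the Notepad-style file is guessed as text/plain yet starts with an XML declaration, while the Word-style file is guessed as application/xml yet does not.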