Re: NY Times reference to 'secret coding'
On Tue, 4 Sep 2007 19:18:29 -0400, noah_mendelsohn@u... wrote:

>There are often at least two levels of concern when considering
>compatibility of an office-style file format: 1) given the published
>specifications and an arbitrary document instance, can you extract the
>general semantics -- for example, can you tell that a given page is in two
>column format with a footnote at the bottom vs. 2) can you tell to the
>exact pixel how that document is to be printed on some particular printer,
>and can you predict for that printer exactly how long the footnote can be
>before it wraps to a second page?

Well put. IMO, the second point is more a consideration for printer drivers than for interchange, but there are indeed multiple levels.

For example, take the RTF spec, with which I am intimately familiar. At first reading, it seems complete, and pretty simple to write to. However, there are at least three kinds of problems: missing pieces, unexpected interactions between codes, and instability.

Missing: The spec describes syntax for "fields", which is accurate as far as it goes. But since the main practical use of RTF is for interchange with Word, you also need to know what fields are used for what purposes, and what switches apply to them. It's not there. You need to use Word to make instances and save as RTF, and then reverse engineer. So you have a published spec that is ineffective without also knowing the proprietary extensions. That's not obvious to someone reading the spec, but an implementor finds out fast.

Interactions: The spec says that a control word, like \par, is terminated by the first whitespace or punctuation after it. But that is not how *some* work. Which ones? Reverse engineer some more. Hint: you won't get footnotes working right unless you do that. When you get to issues like how to wrap an xref in a hyperlink, the code needed gets Byzantine, and the spec offers no guidance.
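The documented termination rule can be sketched in a few lines. This is only an illustration in Python (the post names no language), the function name read_control_word is hypothetical, and it assumes the published syntax: a backslash, ASCII letters, an optional signed numeric parameter, and at most one space consumed as the delimiter. It deliberately does not model the undocumented exceptions mentioned above, which is exactly where a reader built this way would go wrong.

```python
import re

# Documented rule only: backslash, ASCII letters, optional signed integer
# parameter, then at most one space that is consumed as the delimiter.
# Any other non-alphanumeric character ends the word without being consumed.
CONTROL_WORD = re.compile(r"\\([A-Za-z]+)(-?\d+)? ?")


def read_control_word(rtf, pos):
    """Return (name, parameter, next_pos) for the control word starting at
    rtf[pos], or None if no control word starts there."""
    m = CONTROL_WORD.match(rtf, pos)
    if m is None:
        return None
    name = m.group(1)
    param = int(m.group(2)) if m.group(2) is not None else None
    return name, param, m.end()


if __name__ == "__main__":
    for sample in (r"\par Next", r"\fs24\b text", r"\li-720{", "plain"):
        print(repr(sample), "->", read_control_word(sample, 0))
```

In the first sample the space after \par is swallowed as the delimiter; in the second, the backslash that begins \b ends \fs24 and is left for the next read.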
Instability: In Word 95, a WMF graphic used units of twips (1440 per inch) for many purposes; this was built in, not settable. Then in Word 97, the unit changed to himetric (2540 per inch), still with no indication of its value. So RTF made for Word 95 produced shrunken graphics in Word 97. In Word XP (2002), the unit changed back, still with no indication in the RTF. Sure, the spec described it, but compatibility was broken both ways. Similar tweaks happen with other settings.

I could go on about the "PDF killer", XPS, but this is long enough already... ;-) I'll just mention that the spec is *in* XPS, so you need the viewer to read it, and when you go to download the free viewer, you are asked to accept a fascinating agreement first. I read enough to know I would never sign it. On a hunch, I changed the extension to .zip, unzipped, and looked at the results, also interesting. I sure wouldn't vote for it as a standard...

-- Jeremy H. Griffith, at Omni Systems Inc.
   <jeremy@o...> http://www.omsys.com/