Re: Packaging (was Re: Interoperability)
I'm reading this thread about packaging over a month late, but I'd like to throw out an idea on packaging. While zip/jar/xar is certainly a proven and well-understood technique, it would be nice to have a text-based, human-readable and human-editable standard that is simple enough to be implemented as an extension to an XML processor with negligible impact on its code, memory, and processing footprint. Other than suggestions to use MIME packaging, I'm not aware of any discussion of a format that meets these requirements, so it might be useful to share some thoughts on what one could look like.

The basic idea is to wrap file-level content in a stripped-down XML-like syntax that has just one type of "element", which contains only CDATA (well, not even CDATA, just bytes). Metadata is associated with the content via this element's attributes, which may match the semantics of the equivalent MIME or HTTP headers. Here's an example:

<?xpf http://www.somestandard.org/xpf/1/0 ?>
<file:boundary content-location="file1.xml" content-type='text/xml' meta:another-attribute='foo'>
<?xml version="1.0" ?>
<?xml-stylesheet href="styles/style.css" type="text/css"?>
<doc> sample doc </doc>
</file:boundary>
<file content-location="styles/style.css" content-type='text/css' content-length='35' meta:another-attribute='foo'>
BODY { BACKGROUND-IMAGE: url(image/background.bmp) }
</file>
<file content-location='image/background.bmp' content-encoding='gzip'>
[binary data, I wonder how this will show in email]
§ Ú yç©©§õ¾ àOÈSöà4&±Nèg%bÑäÿ8Rg ÈÅ4@OKÑxdÝÎ LY»?a¹á2äÆwÖX¿"©ìÕç`ÛA'ºr¸âÊ×i'|!Ü=c$õ0r¢W£É»ÏÖ®ÞX\õ.íÕçõú 0wJÆ ø3 w.õÒ)?~§l±e)ô6lÎÆ ?IFu @WÄ

The first line contains the header that identifies what kind of content this bag of bits is. The "xpf" stands for extensible packaging format (any better ideas for the name?), while the URI that follows specifies the particular version of this format. Next are the file elements.
This example has three <file> elements that correspond to three interrelated files, each showing one of three ways to define the element's boundaries. If the element has an (arbitrary) string appended to its name (e.g. file:boundary), the element data ends when the processor encounters the matching end tag (in this case "</file:boundary>"). If no boundary string is specified in the element name, the end of the element data is found by skipping ahead by the number of bytes specified by the content-length attribute. If neither is specified, it is assumed that the rest of the file is this element's data (as shown by the last element).

The rest of the example should be fairly obvious -- metadata about each file appears as attributes of the file element, and an appropriate set of MIME headers are spelled as attributes and retain their meanings as specified in the various RFCs. The files in the package can be referenced through the content-location MIME header as specified in the MHTML RFC.

This specification would be orthogonal to any manifest or catalog (such as XPackage) -- they would just be stored as another file in the package. But there should be an attribute (e.g. xpf:manifest) to indicate that the file can be treated as a manifest for this package.

This format can be easily streamed, and it provides random access as long as the content-length attribute is specified for each file. If the processor supports gzip content-encoding, it offers compression comparable to zip.

One limitation of the format as currently described is the lack of an index of the files in the package. Having the manifest orthogonal makes validation of file references and efficient random access difficult, so it would make sense to define optional index elements that contain the minimum information about each file (content-location and content-length).

Well, that's a brief description; I could go into more detail about my thoughts on its syntax, encoding issues, etc., but this email is long enough.
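To make the three boundary rules concrete, here is a rough Python sketch of how a reader might split a package into its files. It is only an illustration under some assumptions that the description above does not settle: the function name parse_xpf is made up, attribute values are assumed to be quoted, the element data is assumed to start immediately after the ">" of the start tag (so any newline there counts as data), and a "</file>" written after a length-delimited body, as in the example, is simply skipped.

```python
import re

# Hypothetical patterns for the proposed syntax; none of this is a published spec.
HEADER_RE = re.compile(rb"\s*<\?xpf\s+(\S+)\s*\?>")
START_RE = re.compile(
    rb"\s*<(file(?::[^\s>]+)?)((?:\s+[\w:.-]+=(?:'[^']*'|\"[^\"]*\"))*)\s*>"
)
ATTR_RE = re.compile(rb"([\w:.-]+)=(?:'([^']*)'|\"([^\"]*)\")")


def parse_xpf(data: bytes):
    """Return (version_uri, [(attrs, body_bytes), ...]) for an XPF-style package."""
    header = HEADER_RE.match(data)
    if header is None:
        raise ValueError("missing <?xpf ... ?> header")
    version = header.group(1).decode("ascii")
    pos = header.end()
    entries = []
    while True:
        start = START_RE.match(data, pos)
        if start is None:
            if data[pos:].strip():
                raise ValueError("unexpected content at byte %d" % pos)
            break
        name = start.group(1)
        attrs = {
            key.decode(): (single or double).decode()
            for key, single, double in ATTR_RE.findall(start.group(2))
        }
        body_start = start.end()  # assumption: data begins right after '>'
        end_tag = b"</" + name + b">"
        if b":" in name:
            # Rule 1: a boundary string in the name; scan for the matching end tag.
            end = data.find(end_tag, body_start)
            if end == -1:
                raise ValueError("no end tag for " + name.decode())
            body = data[body_start:end]
            pos = end + len(end_tag)
        elif "content-length" in attrs:
            # Rule 2: no boundary; skip ahead by content-length bytes.
            end = body_start + int(attrs["content-length"])
            body = data[body_start:end]
            pos = end
            trailing = re.compile(rb"\s*" + re.escape(end_tag)).match(data, pos)
            if trailing:  # assumption: tolerate an optional </file> here
                pos = trailing.end()
        else:
            # Rule 3: neither given; the rest of the package is this file's data.
            body = data[body_start:]
            pos = len(data)
        entries.append((attrs, body))
    return version, entries
```

Calling parse_xpf on the raw bytes of a package returns the version URI from the header plus one (attributes, data) pair per file element. Because the second rule never inspects the data, a reader that only wants one file can hop from start tag to start tag using content-length alone, which is the random-access property mentioned above.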
Actually, this evolved from an idea for a standard header format for embedding metadata that occurred to me while trying to piece together data files from a scrambled hard drive -- if anyone's interested, I could describe that related idea.

-- adam







