STAX Re: Abbreviated Tag Names
If anyone is interested, there is a very simple compression possible called STAX. It could be built into XML parsers trivially, or be a separate layer on top of XML. It converts

  <?xml version="1.0"?>
  <x>
  <y>aaaa</y>
  <y>aaaa</y>
  </x>

to

  <?stax?>
  <?xml version="1.0"?>
  <x>
  <y>aaaa</>
  <>aaaa</>
  </x>

It does not need a stack, nor does it reserve a big header space (it could have a stack, e.g. a fixed-size stack of the deepest 16 elements would be nice). It works best on documents with long names/data, repeated elements, and fairly blunt nesting.

Obviously it doesn't give great compression except for those documents, and even then it cannot compare with binary. But there are three other considerations: first, it does not compress to binary but keeps the document as text (recoverable, readable, and MIME email does not have to base64 encode it); second, the code is trivial to implement on even a very lightweight system (e.g., rolled into the parser, it is just an extra transition or two); third, one can use text processing tools (e.g. perl) to perform the uncompression without going into a binary mode.

Obviously there are lots of other extensions possible, but I wanted to stay SGML compliant (STAX is still SGML, caveat emptor) and avoid headers (to keep it streaming and lightweight).

Fairly old source code for a compressor based on this (STAX format = ShortTAgged Xml) is at

  http://www.ascc.net/~ricko/src/short-tag-compress.c
  http://www.ascc.net/~ricko/src/short-tag-uncompress.c

I think it would be good to have (something like) this kind of ultra-low-end compression available (i.e. as a matter of compression negotiation), because I think many servers are too busy to compress data going out (STAX can be generated by the XML-generating API, and read directly into a SAX stream). I think it would be useful to have several different compression methods widely deployed to suit different situations, with STAX fitting into the extreme low end.
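As a rough illustration of the conversion described above, here is a minimal sketch in Python. It is not the C code at the URLs above and has not been checked against it. The rule is an assumption chosen to reproduce the example: remember only the name of the most recent start-tag (no stack), write <> for an attribute-free start-tag that repeats that name, and write </> for an end-tag that matches it. The names compress, decompress, TAG and MARKER are made up for this sketch, and the regular expression assumes well-formed input with no ">" inside attribute values and no tags hidden in comments or CDATA sections.

```python
import re

MARKER = "<?stax?>"  # flags a document as short-tagged, as in the example

# Matches start-tags and end-tags, including the short forms <> and </>.
# Anything starting with "<?" or "<!" does not match and passes through.
TAG = re.compile(r"<(/?)([A-Za-z_:][\w:.-]*)?(\s[^<>]*|/)?>")


def compress(xml: str) -> str:
    """Shorten tags that repeat the most recent start-tag name."""
    last = None  # the only state kept: name of the most recent start-tag

    def shorten(m):
        nonlocal last
        slash, name, rest = m.group(1), m.group(2), m.group(3)
        if name is None:          # not a normal tag; leave it alone
            return m.group(0)
        if slash:                 # end-tag
            return "</>" if name == last and not rest else m.group(0)
        if name == last and not rest:
            return "<>"           # repeated start-tag without attributes
        last = name               # full start-tag: remember its name
        return m.group(0)

    return MARKER + TAG.sub(shorten, xml)


def decompress(stax: str) -> str:
    """Expand <> and </> back to full tags using the same one-name state."""
    if stax.startswith(MARKER):
        stax = stax[len(MARKER):]
    last = None

    def expand(m):
        nonlocal last
        slash, name, rest = m.group(1), m.group(2), m.group(3)
        if name is None:
            if rest is None and last is not None:
                return f"<{slash}{last}>"   # short tag: fill in the name
            return m.group(0)
        if not slash:
            last = name           # mirror the compressor's bookkeeping
        return m.group(0)

    return TAG.sub(expand, stax)


if __name__ == "__main__":
    doc = '<?xml version="1.0"?> <x> <y>aaaa</y> <y>aaaa</y> </x>'
    packed = compress(doc)
    print(packed)
    assert decompress(packed) == doc  # round trip restores the original
```

Because both directions track the same single name, the round trip holds for any nesting depth; deeper or mixed nesting simply leaves more tags written out in full, which is where a small fixed-size stack would start to pay off.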
If anyone is interested in taking this further, I think it would be good. It is probably the kind of small infrastructure upgrade that could be fun and doable for open-source and collaborative development.

Cheers
Rick Jelliffe