
STAX Re: Abbreviated Tag Names

  • From: Rick Jelliffe <ricko@a...>
  • To: xml-dev@l...
  • Date: Tue, 23 Jan 2001 01:34:59 +0800

If anyone is interested, there is a very simple compression scheme possible,
called STAX.
It could be built into XML parsers trivially, or be a separate layer on top
of XML.
It converts

<?xml version="1.0"?>
<x>
  <y>aaaa</y>
  <y>aaaa</y>
</x>

to

<?stax?>
<?xml version="1.0"?>
<x>
 <y>aaaa</>
 <>aaaa</>
</x>

and does not need a stack or a large reserved header (it could have a stack,
e.g. a fixed-size stack of the 16 deepest elements would be nice).
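
As a rough illustration of the conversion above (this is not the referenced C
source), here is a minimal Python sketch. It assumes a single rule inferred
from the example: a tag without attributes whose name equals the name on the
most recent tag is written as <> or </>. The function name stax_compress and
the regex-based tokenizing are my own assumptions; comments and CDATA sections
get no special treatment.

```python
import re

# Matches start and end tags such as <y>, <y id="1">, <y/> and </y>.
# Processing instructions (<?...?>) and <!...> constructs do not match.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.:-]*)([^<>]*)>")

def stax_compress(xml: str) -> str:
    """Shorten tags that repeat the name of the most recent tag (assumed rule)."""
    last = None  # name on the most recent tag: the only state, no stack

    def shorten(m):
        nonlocal last
        slash, name, rest = m.groups()
        same = (name == last)
        last = name
        if rest.strip():
            # Attributes or an empty-element "/" cannot be carried by <>.
            return m.group(0)
        if same:
            return "</>" if slash else "<>"
        return m.group(0)

    # The <?stax?> marker follows the example in this post.
    return "<?stax?>\n" + TAG.sub(shorten, xml)

if __name__ == "__main__":
    doc = '<?xml version="1.0"?>\n<x>\n  <y>aaaa</y>\n  <y>aaaa</y>\n</x>\n'
    print(stax_compress(doc), end="")
```

Keeping one remembered name rather than a stack is why </x> stays written out
in full: by then the last name seen is y, not x.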

It would work best on documents with long names/data, repeated elements, and
fairly blunt nesting. Obviously it doesn't give great compression except for
those documents, and even then it cannot compare with binary. But there
are three other considerations. First, it does not compress to binary but
keeps the document as text (recoverable, readable, and MIME email does not
have to base64-encode it). Second, the code is trivial to implement on even a
very lightweight system (e.g., rolled into the parser, it is just an extra
transition or two). Third, one can use text processing tools (e.g. perl) to
perform the uncompression without going into a binary mode.
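
To make the third point concrete, here is a minimal text-only uncompression
sketch in Python (perl would do equally well). It assumes that <> and </>
stand for the name on the most recent named tag, which is my reading of the
example earlier in this post and may not cover everything SGML short tags
allow. The name stax_expand is hypothetical.

```python
import re

# Matches <>, </>, and ordinary start/end tags; the tag name is captured
# when present. Processing instructions and <!...> constructs do not match.
TAG = re.compile(r"<(/?)(?:([A-Za-z_][\w.:-]*)[^<>]*)?>")

def stax_expand(text: str) -> str:
    """Restore full tag names from <> and </> (assumed single-name rule)."""
    marker = "<?stax?>"
    if text.startswith(marker):
        # Drop the marker line so the result is plain XML again.
        text = text[len(marker):].lstrip("\n")

    last = None  # name on the most recent named tag

    def expand(m):
        nonlocal last
        slash, name = m.groups()
        if name is not None:
            last = name          # named tag: remember it, pass it through
            return m.group(0)
        if last is None:
            raise ValueError("short tag appears before any named tag")
        return "<" + slash + last + ">"

    return TAG.sub(expand, text)

if __name__ == "__main__":
    packed = '<?stax?>\n<?xml version="1.0"?>\n<x>\n <y>aaaa</>\n <>aaaa</>\n</x>\n'
    print(stax_expand(packed), end="")
```

The whole job is one pass of pattern substitution over text, with a single
remembered name as state.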

Obviously there are lots of other extensions possible, but I wanted to stay
SGML-compliant (STAX is still SGML, caveat emptor) and avoid headers (to
keep it streaming and lightweight).

Fairly old source code for a compressor based on this (STAX
format=ShortTAgged Xml) is at
 http://www.ascc.net/~ricko/src/short-tag-compress.c
 http://www.ascc.net/~ricko/src/short-tag-uncompress.c

I think it would be good to have (something like) this kind of ultra-low-end
compression available (i.e. as a matter of compression negotiation), because
I think many servers are too busy to compress data going out (STAX can be
generated by the XML-generating API, and read directly into a SAX stream).

I think it would be useful to have several different compression methods
widely deployed to suit different situations, with STAX fitting in at the
extreme low end.

If anyone is interested in taking this further, I think it would be good.
And it is probably the kind of small infrastructure upgrade that could be
fun and doable for open-source and collaborative development.

Cheers
Rick Jelliffe

