
Re: Binary-encoding of XML for communication

  • From: "Joshua E. Smith" <jesmith@k...>
  • To: "XML Developers' List" <xml-dev@i...>
  • Date: Mon, 20 Sep 1999 11:00:37 -0400

I looked at the WAP spec and the subsequent comments on this list, and have
concluded:

1) Yeah, a binary spec for XML is a cool idea after all.  If nothing else,
we can probably parse a binary rep faster than we can parse text.

2) The WAP spec does not seem to have any guiding principle for making the
transition from text to binary.  In particular, tossing out comments with
the bathwater is a strange choice.  Also, giving themselves a special set
of enumerations for DTDs is politically curious.

3) I wouldn't be surprised if a document encoded in their binary format
ended up *bigger* than a text XML doc rammed thru zlib (their use of octets
and 32-bit integers is going to lead to lots and lots of 0 bits).  Is LZ
decompression a problem in embedded devices or something?

4) This spec is a lot closer to a network protocol than it is to the XML
spec, and, IMHO, it should be an IETF RFC, not a W3C Rec.

Anyone agree?

I propose we small-fry developers could do the following:

A) Decide *why* we want a binary XML spec, including rationale for that
decision
B) Produce an elegant spec and a reference implementation in C and Java
C) Use IETF or a similarly open forum to promulgate it

I'm willing to step up to take the lead on this, although I'd happily back
off and let someone else take the reins.  I think this can help with both
download size and startup time issues with my company's product, so I'm
motivated to work on it.

With your permission, I'll take a crack at step A (using my best
approximation of the funny language of specs):

<Preamble>

The binary XML format specification, hereafter referred to as XML-bin, is
required to reduce the transmission size of XML documents, to speed
processing of those documents, and to reduce the size and complexity of XML
parser software.  (For purposes of this specification, the existing XML
specification will be referred to as XML-text.)

The XML-bin format specification shall be a lossless encoding of a textual
XML document.  That is, a document can be translated from XML-text to
XML-bin and back an arbitrary number of times, and no information content
will be lost.  Information content, in this sense, excludes those
properties of the text which are defined as "insignificant white space" in
the XML specification [anything else we need to exclude here?].

<Rationale>
The motivation for adjusting the machine representation of XML should be
expressed in the terms of computing machinery.  Allowing this effort to
attempt to change the rules of what should be in an XML document (e.g., the
WAP attempt to banish comments), or to fix some bigger issues (e.g.,
allowing more expressive DTDs) would doubtless interfere with acceptance of
this specification as a standard.
</Rationale>

</Preamble>

How's that?  The obvious (to me, anyway) way to implement that is to choose
a reasonable binary representation of a parse tree -- the way many
programming language compilers store data between their front-end and
back-end processes.  Maybe a string table followed by a binary dump of a
heap (a tree stored in a vector, for those of you who never took a data
structures course), all rammed thru zlib to compress out common patterns.
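
To make the layout above concrete, here is a minimal sketch in Python, not
a proposed format.  Everything specific in it is an assumption made for
illustration: the function names encode and decode, the record layout (five
little-endian 32-bit integers per node, nodes stored in document order with
a child count), and the use of Python's xml.etree.ElementTree as the
front end.  Note that ElementTree's default parser discards comments and
processing instructions and does not preserve namespace prefixes, so this
sketch is not lossless in the sense the preamble asks for; a real encoder
would have to carry those through.

```python
import struct
import zlib
import xml.etree.ElementTree as ET


def encode(xml_text):
    """String table + flat vector of node records, compressed with zlib."""
    root = ET.fromstring(xml_text)
    strings = []   # the string table, in first-seen order
    index = {}     # string -> position in the table

    def intern(s):
        s = s or ""  # treat missing text/tail as the empty string
        if s not in index:
            index[s] = len(strings)
            strings.append(s)
        return index[s]

    nodes = []  # one record per element, in document (preorder) order

    def walk(elem):
        attrs = [(intern(k), intern(v)) for k, v in elem.attrib.items()]
        nodes.append((intern(elem.tag), intern(elem.text),
                      intern(elem.tail), attrs, len(elem)))
        for child in elem:
            walk(child)

    walk(root)

    out = bytearray()
    out += struct.pack("<I", len(strings))
    for s in strings:
        raw = s.encode("utf-8")
        out += struct.pack("<I", len(raw)) + raw
    for tag, text, tail, attrs, nchildren in nodes:
        out += struct.pack("<5I", tag, text, tail, len(attrs), nchildren)
        for k, v in attrs:
            out += struct.pack("<2I", k, v)
    return zlib.compress(bytes(out))


def decode(blob):
    """Inverse of encode(): rebuild the tree and serialize it as text."""
    data = zlib.decompress(blob)
    pos = 0

    def read_u32(count=1):
        nonlocal pos
        values = struct.unpack_from(f"<{count}I", data, pos)
        pos += 4 * count
        return values

    (nstrings,) = read_u32()
    strings = []
    for _ in range(nstrings):
        (length,) = read_u32()
        strings.append(data[pos:pos + length].decode("utf-8"))
        pos += length

    def read_node():
        tag, text, tail, nattrs, nchildren = read_u32(5)
        elem = ET.Element(strings[tag])
        elem.text = strings[text] or None
        elem.tail = strings[tail] or None
        for _ in range(nattrs):
            k, v = read_u32(2)
            elem.set(strings[k], strings[v])
        for _ in range(nchildren):
            elem.append(read_node())
        return elem

    return ET.tostring(read_node(), encoding="unicode")


doc = ('<list kind="demo"><item id="1">alpha</item>'
       '<item id="2">beta</item> tail</list>')
blob = encode(doc)
# The round trip should match what ElementTree itself would print.
assert decode(blob) == ET.tostring(ET.fromstring(doc), encoding="unicode")
```

The fixed-width 32-bit fields are wasteful on their own (mostly zero
bytes), which is exactly the kind of redundancy the final zlib pass is
there to squeeze out.  Whether the result beats plain text run through
zlib would have to be measured on real documents.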

But before we decide on the implementation, we need to reach consensus on
the motivation.  Did I capture it?


-Joshua Smith


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)


