
Re: Proposal for embedding octet-streams in XML (was: XML DTD...binary d

  • From: Chris Olds <colds@n...>
  • To: John Cowan <cowan@l...>, XML-Dev <xml-dev@i...>
  • Date: Fri, 19 Jun 1998 16:43:00 -0700

base64 overhead
I couldn't let this go...

John Cowan wrote:
> 
> Rick Jelliffe wrote:
> 
> > The most common notation to use is Base64. You can find base 64 specified
> > in an RFC.

http://www.faqs.org/rfcs/rfc1341.html

> > You can make a more efficient encoding by using all the available
> > characters. There are several thousand, so you might want to invent your
> > own Base4K encoding, for example, if it was really a big problem.

Base64 is designed to use characters that don't change depending on which page
of ISO 646 is used, and are represented consistently in all versions of EBCDIC
as well.  If you use more than 6 bits, you lose some of these properties.
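
As a rough illustration of the point above, the following sketch checks that none
of the Base64 characters fall on the code positions that ISO 646 leaves open to
national variation. The set of variant positions is my own listing of those
positions, not something taken from this thread, so treat it as an assumption.

```python
import string

# The 64 data characters of Base64 plus the '=' padding character.
BASE64_ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/="
)

# Assumption: the 7-bit positions that ISO 646 national variants may reassign
# (# $ @ [ \ ] ^ ` { | } ~). Listed by hand for this sketch.
ISO646_VARIANT_POSITIONS = {
    0x23, 0x24, 0x40, 0x5B, 0x5C, 0x5D, 0x5E, 0x60, 0x7B, 0x7C, 0x7D, 0x7E,
}

# Any Base64 character that lands on a variant position would be unsafe.
collisions = sorted(
    ch for ch in BASE64_ALPHABET if ord(ch) in ISO646_VARIANT_POSITIONS
)

print("Base64 characters on variant positions:", collisions)
print("Data characters available for 6 bits:", len(BASE64_ALPHABET) - 1)
```

An alphabet wide enough to carry more than 6 bits per character needs more than
64 safe characters, which is where the invariance starts to break down.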

> I propose a compromise: what might be called Base-256 encoding.

[details of the encoding snipped]

> Using this convention causes the data to be expanded by 2:1 in a UCS-2
> representation, by 3:1 in a UTF-8 representation, and by 7:1 in a
> numeric-character-reference representation.  Therefore, it is suitable
> only for relatively small amounts of octet data embedded in a basically
> textual matrix.

Umm...  This only makes sense in a UCS-2 document.
Since Base64 expands 3 bytes into 4 (ASCII) characters, encoding such a
document in UCS-2 would effectively expand 3 bytes into 8 bytes.  In that case,
the 2:1 penalty for shifting each byte into the UCS-2 private use zone is less
than the 8:3 cost of encoding Base64 in UCS-2.
However, since all of the Base64 characters are 7-bit ASCII characters, the
Base64 overhead of 4:3 in UTF-8 is much less than the 3:1 of the UTF-8
representation of U+F0xx, and even the UCS-2 representation of Base64 (at 8:3)
is smaller than the 7:1 that a numeric character reference (in UTF-8 or ISO
Latin) requires.
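
For anyone who wants to check the arithmetic, here is a minimal sketch that
measures the ratios directly. The details of the Base-256 proposal were snipped
above, so the mapping used here (octet b becomes U+F000 + b) is only an
assumption based on the "U+F0xx" mentioned in this message, and the function
name base256_pua is made up for the example.

```python
import base64


def base256_pua(data: bytes) -> str:
    """Hypothetical Base-256 mapping: octet b -> U+F000 + b (an assumption)."""
    return "".join(chr(0xF000 + b) for b in data)


# 768 octets covering every byte value; a multiple of 3, so Base64 adds no padding.
data = bytes(range(256)) * 3

b64 = base64.b64encode(data).decode("ascii")
pua = base256_pua(data)

# Encoded size in bytes for each combination. "utf-16-be" stands in for UCS-2
# here because every character involved is in the Basic Multilingual Plane.
sizes = {
    "Base64 in UTF-8": len(b64.encode("utf-8")),
    "Base64 in UCS-2": len(b64.encode("utf-16-be")),
    "Base-256 in UTF-8": len(pua.encode("utf-8")),
    "Base-256 in UCS-2": len(pua.encode("utf-16-be")),
}

# Expansion ratio relative to the raw octet count.
ratios = {name: size / len(data) for name, size in sizes.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.3f}:1")
```

The private-use mapping only comes out ahead of Base64 in the UCS-2 row; in
UTF-8 each U+F0xx character takes three bytes, against 4/3 of a byte per octet
for Base64.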

	/cco


