Re: binary base64 definition

  • From: Amy Lewis <amyzing@t...>
  • To: Jerry Johns <Jjohns@c...>, xml-dev@l...
  • Date: Sat, 06 Jan 2001 19:29:22 -0500

On Sat, Jan 06, 2001 at 06:15:24PM -0500, Jerry Johns wrote:
>Following the suggestion of describing exactly what I'm trying to do, here
>it is:
>
<snip />
>
>One method I'm considering is converting the JPEG file to base64, which
>results in a string of text characters, as you know. However, this data
>could also contain special characters, ie: greater-than symbol, less-than
>symbol, etc. 
>
>Assuming this was a good approach, I have two issues remaining: 1) how to
>code and decode the base64 and 2) can I prevent the DOM API from parsing the
>encoded JPEG data and converting greater-than and less-than symbols into
>"&lt;" and "&gt;" text strings.

You don't mention the language or tools that you're planning on using
for generating either the XML or the BASE64 encodings.  Whether an
encoder/decoder is available (there are several for Java, though none
in the public part of the JDK) depends on the choice of tools.  I would
be very surprised to find that any language or toolkit that has been
used for network-related technologies lacks an encoder and decoder
(though it might well be hidden, like Sun's in the JDK, or extra, like
IBM's Java implementation).

As for escaping special characters, you've been sadly led astray.
BASE64 encodes each group of three eight-bit bytes as four characters,
each of which carries six bits of the input.  The output characters are
not the raw values 0-63, though; they are drawn from a fixed set of
printable characters that fit in seven-bit ASCII (nothing 128 and
higher, nothing 32 and under).  It is actually an alphabet of 65
characters: A-Z, a-z, 0-9, +, and /, plus "=", which has the special
meaning of padding.  The encoded stream may also contain "white space
characters" (CR, LF, tab, space), which the decoder ignores.  None of
these are problematic either to XML or to most network protocols using
the NVT, so there is no "<", ">", or "&" in the output to escape.  If
your tool set doesn't contain an encoder/decoder, the specification
should be sufficient to develop one.  And just think ... by design, you
even get support for EBCDIC!  Wow!
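
As a rough illustration of the point above, here is a minimal sketch of
an encoder written directly from the RFC 1521 description.  It is in
Python only as an assumption, since the original question does not name
a language.  The function name `b64encode_sketch` is hypothetical, and
the standard library's `base64` module appears only as a cross-check.

```python
import base64
import string

# The 64-character alphabet from RFC 1521 section 5.2; "=" is the pad character.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def b64encode_sketch(data: bytes) -> str:
    """Encode bytes as base64 text: each 3-byte group becomes 4 characters."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Pack the chunk into a 24-bit integer, zero-filling a short final group.
        n = int.from_bytes(chunk + b"\x00" * (3 - len(chunk)), "big")
        # Split the 24 bits into four 6-bit indices into the alphabet.
        chars = [ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # A final group of 1 or 2 bytes yields 2 or 3 characters plus padding.
        keep = len(chunk) + 1
        out.append("".join(chars[:keep]) + "=" * (4 - keep))
    return "".join(out)


if __name__ == "__main__":
    sample = bytes(range(256))  # every possible byte value
    encoded = b64encode_sketch(sample)
    # Agrees with the standard library encoder.
    assert encoded == base64.b64encode(sample).decode("ascii")
    # Nothing outside the 65-character set, so nothing for XML to escape.
    assert set(encoded) <= set(ALPHABET + "=")
    print(encoded[:16])
```

This sketch emits one unbroken string; breaking the output into lines is
left out to keep the six-bit regrouping easy to see.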

All of this information is easily available; a net search should soon
lead you to RFC 1521 (section 5.2), since replaced by RFC 2045 (section
6.8), or, for that matter, to RFC 1421 section 4.3.2.4, although it
doesn't call it "base 64".

Once you've decided to do this, and figured out what your schema is,
you can simply add an 'encoding="BASE64"' attribute, or some such.  You
might want to tell the tool to leave the line breaks alone (and not to
indent), but it shouldn't matter unless you have a tool that can't cope
with the plain old stream (the spec says that lines are no more than 76
characters each).
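
To make the suggestion above concrete, here is a minimal sketch in
Python (an assumed language, since the question does not name one),
using the standard library's `xml.dom.minidom` as a stand-in for
whatever DOM implementation is actually in use.  The element name
`image`, the helper names `wrap_in_xml` and `unwrap_from_xml`, and the
sample bytes are all hypothetical; only the `encoding="BASE64"`
attribute comes from the text above.

```python
import base64
from xml.dom import minidom


def wrap_in_xml(payload: bytes) -> str:
    """Return an XML document holding payload as base64 text."""
    doc = minidom.Document()
    image = doc.createElement("image")          # hypothetical element name
    image.setAttribute("encoding", "BASE64")    # attribute suggested in the post
    # encodebytes() breaks the output into lines of at most 76 characters.
    text = base64.encodebytes(payload).decode("ascii")
    image.appendChild(doc.createTextNode(text))
    doc.appendChild(image)
    # toxml() serializes without adding indentation, so the line breaks survive.
    return doc.toxml()


def unwrap_from_xml(xml_text: str) -> bytes:
    """Recover the original bytes from a document made by wrap_in_xml()."""
    doc = minidom.parseString(xml_text)
    image = doc.documentElement
    text = "".join(node.data for node in image.childNodes
                   if node.nodeType == node.TEXT_NODE)
    # By default b64decode() discards characters outside the alphabet,
    # which includes the line breaks inserted by encodebytes().
    return base64.b64decode(text)


if __name__ == "__main__":
    # Stand-in bytes, not a real image.
    fake_jpeg = b"\xff\xd8\xff\xe0" + bytes(range(256))
    xml_text = wrap_in_xml(fake_jpeg)
    assert unwrap_from_xml(xml_text) == fake_jpeg
    # The DOM serializer had nothing to escape in the encoded data.
    assert "&lt;" not in xml_text and "&gt;" not in xml_text
```

Because the encoded text contains no markup characters, the serializer
passes it through untouched, and the decoder skips the line breaks on
the way back.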

Amy!
-- 
Amelia A. Lewis          alicorn@m...          amyzing@t...

