Binary XML

How do I get my binary data into XML?
How do I get my binary data out of XML?

Ah, the age-old problem (well, at least since XML was invented): how do you embed binary data in XML? Never mind why; metaphysics is outside the scope of this document. We're going to tell you how to put it in, and how to get it out.

XML is not the ideal carrier for binary data. It is a text format, and as such doesn't cope well with raw bits. But if the binary data is properly encoded, using something like the W3C XML Schema types base64Binary or hexBinary, then reading and writing binary files from XSLT and/or XQuery with the XML Converters becomes a snap.

base64Binary, Base-64 and XML

We're going to use the Base-64 encoding format for this demonstration, since it is more compact than hex encoding. Jump to the end for a quick look at hexBinary.

Binary to XML

First, let's encode binary as XML. We'll take the shirt images that we used in the XML Report demonstration. Our source material will then be:

  • The shirt images
  • An XML file listing the names of the files
  • An XSLT transform that combines the two

The XML file will be very simple:

<?xml version="1.0" encoding="US-ASCII"?>
<list>
    <item>shirt-004.gif</item>
    <item>shirt-076.gif</item>
    <item>shirt-148.gif</item>
    <item>shirt-220.gif</item>
    <item>shirt-292.gif</item>
</list>

And the XSLT not much bigger:

<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" encoding="US-ASCII"/>
    <xsl:template match="/">
        <list>
            <xsl:apply-templates select="list/item"/>
        </list>
    </xsl:template>
    <xsl:template match="item">
        <file name="{.}">
            <xsl:value-of select="document(concat('adapter:Base-64?http://www.stylusstudio.com/images/publish/', .))"/>
        </file>
    </xsl:template>
</xsl:stylesheet>

So, how does it work? For each <item>, the transform creates a new <file> element with an attribute named "name" holding the input file name. The element's content is the shirt file, read through the Base-64 Deployment Adapter. That adapter takes any raw data and returns it as Base-64 encoded text, which is entirely compatible with XML. (Note that the references to the files are absolute and point to our server. If the files were local, you could just write <xsl:value-of select="document(concat('adapter:Base-64?', .))"/> without the full URL.)
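For readers who want to see the wrapping step outside Stylus Studio, here is a minimal Python sketch of the same idea. It is not how the adapter is implemented; the function name files_to_xml and the six-byte sample payload are assumptions made purely for illustration, and only the Python standard library is used.

```python
import base64
import xml.etree.ElementTree as ET


def files_to_xml(files):
    """Wrap each (name, raw bytes) pair in a <file name="..."> element."""
    root = ET.Element("list")
    for name, data in files.items():
        item = ET.SubElement(root, "file", name=name)
        # Base-64 output uses only ASCII letters, digits, '+', '/' and '=',
        # so it is safe to store as element text.
        item.text = base64.b64encode(data).decode("ascii")
    return root


# Hypothetical stand-in for a real image: just the 6-byte GIF signature.
sample = {"shirt-004.gif": b"GIF89a"}
xml_text = ET.tostring(files_to_xml(sample), encoding="unicode")
print(xml_text)
# prints: <list><file name="shirt-004.gif">R0lGODlh</file></list>
```

The XSLT version does the same wrapping for all five shirt files, with the adapter doing the encoding as each file is read.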

The XSLT transform produces an XML document that looks like this:

<?xml version='1.0' encoding='US-ASCII' ?>
<list><file name="shirt-004.gif">
R0lGODlhZABqALMAAFrMYr/BvlKOVJKOg2xZUKmenMfDw8tgWJpVUbaxsPb19v///+bm5tfX1wAA
AAAAACwAAAAAZABqAAAE/3DJSau9mCrGWhkFIopF2TSMkq1s674rk4x0TRB1ETQq7P/AS6gmOhiP
Rxuu0Ag6nyzGCEmtVmkFqHa7MByK1rCV8OWagwUjTsw2FhG9s5ylQCRxtIF+gNjTvmtfDHOEGQ12
....lots more base-64 stuff....</file><file name="shirt-076.gif">
....lots of base-64 stuff....</file><file name="shirt-148.gif">
....lots of base-64 stuff....</file><file name="shirt-220.gif">
....lots of base-64 stuff....</file><file name="shirt-292.gif">
....lots of base-64 stuff....</file></list>

That R0lGODlhZABqAL stuff is the Base-64 equivalent of the binary data. Any encoding that maps binary data onto a restricted character set expands it: Base-64 output is about 33% larger than the raw data (for every three bytes in, four characters go out). That is still pretty good; hex encoding doubles the size (for every one byte in, two characters go out).
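The size arithmetic is easy to check with a few lines of Python. This is only a sketch using the standard library; the 768-byte sample is arbitrary test data, chosen as a multiple of three so that no Base-64 padding is involved.

```python
import base64

# 768 arbitrary bytes; a multiple of three, so Base-64 needs no '=' padding.
raw = bytes(range(256)) * 3

b64 = base64.b64encode(raw)   # 4 output characters per 3 input bytes
hexed = raw.hex()             # 2 output characters per input byte

base64_ratio = len(b64) / len(raw)
hex_ratio = len(hexed) / len(raw)
print(len(raw), len(b64), len(hexed))
# prints: 768 1024 1536

# The first eight characters of the listing above are the GIF signature.
signature = base64.b64decode("R0lGODlh")
print(signature)
# prints: b'GIF89a'
```

Line breaks inserted into the encoded text, like the ones in the listing above, add a little on top of the four-thirds ratio.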

XML to Binary

Now we've got it in; how do we get it out? Supplying that same document we just created as input to this next XSLT document will give us the five shirts' worth of GIF files again. (And we've added a little bit to write out status messages while it is working so we know what we've created.)

<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
        <xsl:apply-templates select="list/file"/>
    </xsl:template>
    <xsl:template match="file">
        <xsl:text>Writing </xsl:text>
        <xsl:value-of select="@name"/>
        <xsl:text>&#10;</xsl:text>
        <xsl:result-document href="{concat('adapter:Base-64?',@name)}">
            <xsl:copy-of select="."/>
        </xsl:result-document>
    </xsl:template>
</xsl:stylesheet>

The key here is the xsl:result-document section, an XSLT 2.0 instruction (which is why this stylesheet declares version="2.0"). It opens a new document and copies in the contents of our Base-64 encoded field. Without the Base-64 Deployment Adapter, this would just get copied out to that file as a bit of XML. But since we're now writing through this bi-directional adapter, it catches the XML, takes all of the Base-64 it sees inside, and turns it back into binary. So what persists on disk in the end is the actual GIF file.
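The reverse direction can be sketched in Python as well. Again, this is an illustration of the decoding step rather than the adapter's actual behavior; write_files, the inline sample document, and the temporary output directory are all assumptions made for the example.

```python
import base64
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical input: one <file> whose payload is just the GIF signature.
xml_text = '<list><file name="shirt-004.gif">\nR0lGODlh\n</file></list>'


def write_files(xml_text, out_dir):
    """Decode every <file> element and write it under out_dir."""
    written = []
    for elem in ET.fromstring(xml_text).iter("file"):
        # Drop whitespace that line wrapping may have added to the payload.
        payload = "".join((elem.text or "").split())
        # Keep only the final path component so a name cannot escape out_dir.
        target = Path(out_dir) / Path(elem.get("name")).name
        target.write_bytes(base64.b64decode(payload))
        print("Writing", target.name)
        written.append(target)
    return written


with tempfile.TemporaryDirectory() as tmp:
    paths = write_files(xml_text, tmp)
    restored = paths[0].read_bytes()
```

As in the XSLT, a status line is printed for each file, and the bytes that land on disk are the decoded binary rather than the Base-64 text.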

hexBinary, Hex and XML

Exactly the same results can be achieved with hex-encoded data. Just use the Binary Deployment Adapter, which by default uses base-16 for encoding, which is just what the W3C XML Schema data type hexBinary expects.
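To see what hex-encoded content looks like next to its Base-64 counterpart, here is a short Python sketch. It uses only the standard library, and the six-byte sample is a stand-in for real file data. The output is upper-cased because the canonical form of hexBinary uses uppercase digits, although for this particular sample the digits happen to be all numeric.

```python
import base64

raw = b"GIF89a"  # stand-in for real binary data

hex_text = raw.hex().upper()                       # two hex digits per byte
b64_text = base64.b64encode(raw).decode("ascii")   # four characters per three bytes

round_trip = bytes.fromhex(hex_text)               # decoding restores the bytes
print(hex_text, b64_text)
# prints: 474946383961 R0lGODlh
```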

Read and Write Binary XML

There isn't really any such thing as "binary XML", but even that won't stop you from mixing binary and text inside XML thanks to the Base-64 and Binary Converters. The target format doesn't even have to be a file; the adapter could be used in a web serving environment to feed images right from source XML. Since you control the code, there are no limits. Examine the adapters and the rest of Stylus Studio® 2009 XML Enterprise Suite Release 2 by downloading an evaluation copy and trying these samples today!
