Embedding image files in XML using data URLs

  • From: "Costello, Roger L." <costello@mitre.org>
  • To: "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
  • Date: Wed, 29 Nov 2017 16:03:51 +0000

Hi Folks,

 

You are familiar with URLs such as this:

 

http://www.example.org

 

That URL uses the http scheme.

 

There are other schemes in addition to the http scheme. One such scheme is the data URI scheme. Let’s look at it. But first, let’s motivate its use.

 

Motivation for the Data URI Scheme

Oftentimes a web page wants to display an image. One way to accomplish this is to provide, in the HTML, the URL to the image file. Like this:

 

<h2>Link to an image</h2>

<img alt="image" src="https://upload.wikimedia.org/wikipedia/commons/3/38/JPEG_example_JPG_RIP_001.jpg">

 

When that HTML is dropped into a browser, the browser fetches and displays the image.

A disadvantage of that approach (linking to an image file) is that the browser must download the web page (the HTML file) and then make a second request for the image file. The image request cannot start until the HTML has arrived, so the two fetches are sequential, which adds latency.

 

Alternatively, you can use a data URL in your web page. The data URL contains the image itself, inlined as base64-encoded text. I used an online tool to encode the image to base64 and then replaced the link with the base64 text of the image. Here’s what the HTML looks like now, using a data URL (most of the base64 text is elided):

 

<h2>Inline the image using a data URL</h2>

<img alt="image" src="data:image/jpeg;base64,…">

 

When that HTML is dropped into a browser, the browser displays the same image as before.

 

Although the base64 text increases the size of the HTML file (base64 output is about one third larger than the binary image), the browser needs to do only one fetch. That can be more efficient.
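
To make the encoding step concrete, here is a minimal Python sketch of how a data URL could be built from raw image bytes. The function name make_data_url and the stand-in bytes are my own illustration, not the online tool mentioned above; the image/jpeg media type is assumed because the example image is a JPEG.

```python
import base64


def make_data_url(image_bytes: bytes, media_type: str = "image/jpeg") -> str:
    """Return a base64 data URL that inlines image_bytes (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{media_type};base64,{encoded}"


# Stand-in bytes for illustration; a real page would read the image file instead.
sample = bytes(range(256)) * 3  # 768 bytes of binary data
url = make_data_url(sample)

# Compare the size of the binary data with the size of the resulting data URL.
print(len(sample), len(url))
```

The string returned by make_data_url is what goes into the src attribute in place of the link.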

Disadvantage of Data URLs

A downside of data URLs is that they can circumvent certain detection and filtering methods. Consider this: a company wants to prohibit images from certain web sites. The company maintains a blacklist of links to those image files. Whenever the company’s firewall sees a web page containing one of those links, the firewall removes the link, thereby preventing the image from being downloaded. With a data URL, however, the image is smuggled in unnoticed inside the web page, because a data URL carries the image itself and does not identify where the image came from.
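
The filtering gap can be sketched in a few lines of Python. This is only an illustration under assumptions: the blacklist entry, the host name images.blocked.example, and both function names are hypothetical, and a real firewall would be far more involved.

```python
from urllib.parse import urlsplit

# Hypothetical blacklist of links to prohibited image files.
BLOCKED_LINKS = {"https://images.blocked.example/photo.jpg"}


def is_blocked(url: str) -> bool:
    """Link-based filter: blocks only URLs that appear on the blacklist."""
    return url in BLOCKED_LINKS


def is_data_url(url: str) -> bool:
    """Scheme-based check: URI schemes are case-insensitive."""
    return url.strip().lower().startswith("data:")


linked = "https://images.blocked.example/photo.jpg"
inlined = "data:image/jpeg;base64,YWJj"  # placeholder payload, not a real image

print(is_blocked(linked))   # the link is on the blacklist
print(is_blocked(inlined))  # the data URL is not, whatever bytes it carries
print(urlsplit(inlined).hostname)  # a data URL has no host to filter on
print(is_data_url(inlined))
```

A filter that wants to catch inlined images has to look at the scheme (or at the decoded bytes), since there is no location to compare against the blacklist.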

Data URLs in XML Documents

A data URL is a URL. XML Schema has a datatype for URI values, the xs:anyURI datatype. Can an XML element that is declared to be of type xs:anyURI hold a data URL? Let’s see.

I created an XML schema and declared an element to be of type xs:anyURI:

<xs:element name="Image" type="xs:anyURI" />

 

The value of the <Image> element in the following XML instance is an ordinary URL:

 

<Image>https://upload.wikimedia.org/wikipedia/commons/3/38/JPEG_example_JPG_RIP_001.jpg</Image>

 

That instance is schema-valid.

Next, I replaced that ordinary URL with a data URL. For brevity, I elided most of the data URL:

<Image> … //9k=</Image>

 

That also validated.

 

Conclusion

An xs:anyURI value can be a data URL. That is, an xs:anyURI value can either link to external data or inline the data using a data URL.

 

A data URL in an XML instance document exposes the recipient of the XML to the risks described above. Namely, link filters can be circumvented and undesirable data can be smuggled in without notice.

 

To prevent an xs:anyURI value from containing a data URL, use the xs:pattern facet to constrain the value.
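
As a rough illustration of the xs:pattern idea, the Python sketch below mirrors one possible restriction: a whitelist that allows only http and https links. The type name LinkOnlyURI, the pattern https?://.*, and the function passes_pattern are assumptions of mine, not something I ran through a schema validator. XSD patterns must match the entire value, so the sketch uses re.fullmatch to imitate that.

```python
import re

# Hypothetical XSD restriction that this sketch mirrors:
#   <xs:simpleType name="LinkOnlyURI">
#     <xs:restriction base="xs:anyURI">
#       <xs:pattern value="https?://.*"/>
#     </xs:restriction>
#   </xs:simpleType>
ALLOWED = re.compile(r"https?://.*")


def passes_pattern(value: str) -> bool:
    """Imitate the pattern facet: the whole value must match."""
    return ALLOWED.fullmatch(value) is not None


link = "https://upload.wikimedia.org/wikipedia/commons/3/38/JPEG_example_JPG_RIP_001.jpg"
inlined = "data:image/jpeg;base64,YWJj"  # placeholder payload, not a real image

print(passes_pattern(link))
print(passes_pattern(inlined))
```

A whitelist of allowed schemes seems easier to get right than a pattern that tries to exclude "data:" directly, since XSD regular expressions have no negative lookahead and the scheme name is case-insensitive. The trade-off is that this particular pattern also rejects every other scheme, and upper-case spellings such as HTTPS://.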

 

Comments?

 

/Roger


