RE: Naive question about binary encodings

Thanks- this makes sense.  I was thinking that since attributes do, to
some extent, provide processing info anyway (like preserving whitespace
and specifying namespace) it wouldn't matter, but the transcoding issue
is one I totally spaced.

-----Original Message-----
From: Rick Jelliffe [mailto:ricko@a...] 
Sent: Tuesday, March 25, 2003 9:33 PM
To: xml-dev@l...
Subject: Re:  Naive question about binary encodings 

When an XML document is read, a transcoder converts it from a 
sequence of bytes to a sequence of characters (in the particular
encoding conventions used by the receiving system).

In order to embed binary data, you would have to transcode the
incoming data character-by-character and parse it token-by-token
to know which mode to use when reading the contents of the next element.

However, it was exactly this kind of modal parsing that XML
is a rejection of: in SGML you have a lot of parsing modes,
depending on markup or on DTDs.  Also, it is very inefficient
to transcode character-by-character. 

Why not, then, transcode the document but keep the original bytes
and then use them?  Because a transcoder is supposed to fail
if an incorrect byte sequence is found, and binary data could 
easily contain them.  So we cannot really use this method,
unless we use some exception system that skips past
badly encoded sections, but this seems pretty complicated.
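
As a rough illustration of that failure, here is a minimal Python sketch.
The element name foo and the payload bytes are made up for the example, and
Python's strict UTF-8 decoder merely stands in for the transcoder; a real
XML parser may report the error differently.

```python
# Hypothetical document: text markup wrapped around raw, unencoded octets.
payload = bytes([0x89, 0xFF, 0xFE, 0x00, 0xC0])  # arbitrary binary, not valid UTF-8
document = b"<foo>" + payload + b"</foo>"


def transcode(data: bytes) -> str:
    """Stand-in for the transcoder: strict decoding of the whole byte stream."""
    return data.decode("utf-8", errors="strict")


try:
    transcode(document)
    failed = False
    bad_offset = None
except UnicodeDecodeError as exc:
    failed = True
    bad_offset = exc.start  # index of the first byte the decoder rejected

print(failed, bad_offset)
```

The decoder gives up at the first payload byte, before any markup-level
logic gets a chance to see a length attribute.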

Another reason is that we want to be able to open an XML file
in a suitable text editor, alter it, then save it. Binary data would
probably be corrupted.
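
A small Python sketch of that round trip, assuming an editor that opens
the file leniently (substituting U+FFFD for bytes it cannot decode) and
saves as UTF-8. Real editors vary, and the element names and payload are
hypothetical.

```python
# Hypothetical document with raw binary octets between the tags.
payload = bytes([0x89, 0xFF, 0xFE, 0x00, 0xC0])
document = b"<foo>" + payload + b"</foo>"

# Assumed editor behaviour: lenient decode, replacing undecodable bytes.
text = document.decode("utf-8", errors="replace")

# The user makes an unrelated edit (renames the element) and saves.
edited = text.replace("foo", "bar")
saved = edited.encode("utf-8")

# Pull the bytes between the tags back out and compare with the original.
recovered = saved[len(b"<bar>"):-len(b"</bar>")]
print(recovered == payload)
```

The bytes that come back are the UTF-8 encoding of the replacement
characters, not the original octets, so the payload is lost even though
the user never touched it.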

Another reason is that arbitrary binary data contains 0x00.
If the data is being read into C's char type and manipulated 
as a string, for example, this will cause a problem.  
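
The effect can be sketched from Python with the standard ctypes module,
which exposes a C char buffer. The payload here is invented for the
example; buf.value reads the buffer as a null-terminated C string, while
buf.raw returns every byte in it.

```python
import ctypes

payload = b"AB\x00CD"  # five octets with 0x00 in the middle

# A C char array holding the payload plus a trailing NUL terminator.
buf = ctypes.create_string_buffer(payload)

as_c_string = buf.value  # C-string view: stops at the first 0x00
all_bytes = buf.raw      # the whole buffer, terminator included

print(as_c_string, len(all_bytes))
```

Anything that treats the data as a C string sees only the first two
octets; the rest is silently dropped unless the length is carried
separately.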

If we changed XML so that

- it was not a textual format
- it used only the ASCII encoding
- it did not have to allow implementations using null-terminated strings

then binary sections could be included.  But they are quite big changes:
removing the ML from XML!

Rick Jelliffe

----- Original Message ----- 
From: "Linda Grimaldi" <grimlinda@e...>
To: <xml-dev@l...>
Sent: Wednesday, March 26, 2003 1:46 PM
Subject:  Naive question about binary encodings 

I'm sure this is really naïve of me, but I have to admit that I don't
understand why binary data cannot be sent within an xml document by
providing appropriately namespace-controlled attributes that identify an
element type as binary and specify its length.  For example:

<foo xml:mytype="binary" xml:binarylength="1776"> 
or some such thing, followed by 1776 octets of whatever one likes.  It's
kind of like MIME in some respects- as long as its type and length are
identified, you can do whatever you like with it.

It's so obvious that I am sure it has been rejected already, but, being
a mere implementor, I don't quite understand why.


The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
initiative of OASIS <http://www.oasis-open.org>

The list archives are at http://lists.xml.org/archives/xml-dev/

To subscribe or unsubscribe from this list use the subscription
manager: <http://lists.xml.org/ob/adm.pl>


