Re: Naive question about binary encodings
When an XML document is read, a transcoder converts it from a sequence of bytes to a sequence of characters (in the particular encoding conventions used by the receiving system). In order to embed binary data, you would have to transcode the incoming data character-by-character and parse it token-by-token to know which mode to use when reading the contents of the next element. However, it is exactly this kind of modal parsing that XML is a rejection of: in SGML you have a lot of parsing modes, depending on markup or on DTDs. Also, it is very inefficient to transcode character-by-character.

Why not, then, transcode the document but keep the original bytes and then use them? Because a transcoder is supposed to fail if an incorrect byte sequence is found, and binary data could easily contain one. So we cannot really use this method, unless we use some exception system that skips past badly encoded sections, but this seems pretty complicated.

Another reason is that we want to be able to open an XML file in a suitable text editor, alter it, then save it. Binary data would probably be corrupted.

Another reason is that arbitrary binary data can contain 0x00. If the data is being read into C's char type and manipulated as a string, for example, this will cause a problem.

If we changed XML so that
- it was not a textual format
- it used only the ASCII encoding
- it did not have to allow implementations by null-terminated strings
then binary sections could be included. But they are quite big changes: removing the ML from XML!

Cheers
Rick Jelliffe

----- Original Message -----
From: "Linda Grimaldi" <grimlinda@e...>
To: <xml-dev@l...>
Sent: Wednesday, March 26, 2003 1:46 PM
Subject: Naive question about binary encodings

I'm sure this is really naïve of me, but I have to admit that I don't understand why binary data cannot be sent within an XML document by providing appropriately namespace-controlled attributes that identify an element type as binary and specify its length.
For example:

<foo xml:mytype="binary" xml:binarylength="1776">

or some such thing, followed by 1776 octets of whatever one likes. It's kind of like MIME in some respects: as long as its type and length are identified, you can do whatever you like with it. It's so obvious that I am sure it has been rejected already, but, being a mere implementor, I don't quite understand why.

Thanks,
Linda

-----------------------------------------------------------------
The xml-dev list is sponsored by XML.org <http://www.xml.org>, an initiative of OASIS <http://www.oasis-open.org>

The list archives are at http://lists.xml.org/archives/xml-dev/

To subscribe or unsubscribe from this list use the subscription manager: <http://lists.xml.org/ob/adm.pl>
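
Two of the points in the reply (a transcoder is supposed to fail on an incorrect byte sequence, and arbitrary binary data can contain 0x00) can be illustrated with a rough sketch. This is not how any particular XML parser works. The payload bytes, the helper names try_decode and c_string_view, and the choice of UTF-8 as the document encoding are all assumptions made for the illustration.

```python
# Hypothetical payload: four arbitrary octets standing in for "binary data".
# 0x89 is not a valid UTF-8 lead byte, and 0x00 is a NUL.
payload = bytes([0x89, 0x00, 0xFF, 0xFE])

# The proposal from the original question: raw octets placed inside an element.
doc = b"<foo>" + payload + b"</foo>"


def try_decode(data: bytes, encoding: str = "utf-8") -> bool:
    """Return True if a strict transcoder accepts the bytes, False if it fails."""
    try:
        data.decode(encoding)  # strict error handling is Python's default
        return True
    except UnicodeDecodeError:
        return False


def c_string_view(data: bytes) -> bytes:
    """Approximate what a null-terminated string API would see:
    everything up to, but not including, the first 0x00 byte."""
    return data.split(b"\x00", 1)[0]


if __name__ == "__main__":
    print("plain text decodes:", try_decode(b"<foo>text</foo>"))
    print("binary payload decodes:", try_decode(doc))
    print("bytes visible to a C-style string:", c_string_view(doc))
```

Under these assumptions, the strict decode rejects the document as soon as it reaches the first payload byte, and the C-style view loses everything from the 0x00 onward, including the closing tag.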