Proposal for embedding octet-streams in XML (was: XML DTD...binary data)
Rick Jelliffe wrote:
> The most common notation to use is Base64. You can find base 64 specified in
> an RFC.
>
> You can make a more efficient encoding by using all the available
> characters. There are several thousand, so you might want to invent your
> own Base4K encoding, for example, if it was really a big problem.

I propose a compromise: what might be called Base-256 encoding.

To embed a stream of arbitrary octets in an XML document, place them as the #PCDATA content of a suitable element. Each octet value 0-255 is encoded as the corresponding Unicode Private Zone character U+F000-U+F0FF. These characters are conveniently located in the middle of the Private Zone.

Using this convention causes the data to be expanded by 2:1 in a UCS-2 representation, by 3:1 in a UTF-8 representation, and by 8:1 in a numeric-character-reference representation (a reference such as &#xF041; is eight characters long). Therefore, it is suitable only for relatively small amounts of octet data embedded in a basically textual matrix.

-- 
John Cowan cowan@c...
e'osai ko sarji la lojban.

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message; (un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message; subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
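As a rough illustration of the mapping proposed above, here is a minimal Python sketch. The names encode_octets, decode_octets, to_char_refs and PUA_BASE are invented for this example and are not part of the proposal. The sketch assumes the hexadecimal form of character reference and uses big-endian UTF-16 without a byte-order mark as a stand-in for UCS-2.

```python
# Sketch of the proposed "Base-256" mapping: octet n <-> U+F000 + n.
# All names here are hypothetical, chosen for illustration only.
PUA_BASE = 0xF000


def encode_octets(data: bytes) -> str:
    """Map each octet 0-255 to the private-use character U+F000..U+F0FF."""
    return "".join(chr(PUA_BASE + b) for b in data)


def decode_octets(text: str) -> bytes:
    """Invert encode_octets; reject characters outside U+F000..U+F0FF."""
    out = bytearray()
    for ch in text:
        offset = ord(ch) - PUA_BASE
        if not 0 <= offset <= 0xFF:
            raise ValueError(f"not a Base-256 character: U+{ord(ch):04X}")
        out.append(offset)
    return bytes(out)


def to_char_refs(text: str) -> str:
    """Spell each character as a hexadecimal numeric character reference."""
    return "".join(f"&#x{ord(ch):X};" for ch in text)


payload = bytes(range(256))
encoded = encode_octets(payload)
assert decode_octets(encoded) == payload  # the mapping round-trips

# Size per octet in each representation.
print(len(encoded.encode("utf-16-be")) / len(payload))  # 2.0
print(len(encoded.encode("utf-8")) / len(payload))      # 3.0
print(len(to_char_refs(encoded)) / len(payload))        # 8.0
```

The decoder rejects anything outside U+F000-U+F0FF rather than skipping it. That is a design choice of this sketch; the proposal itself does not say how stray characters, such as whitespace added by pretty-printing, should be treated.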