RE: binary base64 definition
On Sat, 6 Jan 2001, Danny Vint wrote:

> At 06:15 PM 1/6/2001 -0500, Jerry Johns wrote:
> >Thanks much for your input. You guys are a tremendous resource.
> >
> >Following the suggestion of describing exactly what I'm trying to do, here
> >it is:
> >
> >I'm trying to implement an interface in XML. The file has to contain some
> >dollar amounts, which is straightforward. The file also needs to contain
> >several "objects" that pertain to the dollar amounts. These objects include
> >an HTML web page and JPEG images. My trading partner will use the dollar
> >amounts for business logic and to store in database. The HTML and JPEG
> >objects will be loaded into an imaging system.
> >
> >I could easily go with the approach of putting the JPEG images as separate
> >files and FTP them along with the XML file. However, I was striving to keep
> >everything in a single file.
> >
> There are some new messaging specs being worked on that would allow you to
> use something like multipart MIME to wrap all the pieces together - but
> this is real early work going on there.
>
> This has also been a problem in the SGML days as well but basically it
> wasn't as big of a deal because everything was pretty much on the file
> system with documents and it was only when you were exchanging the
> information that you would want to create a single file.

One option would be to grab some code that already handles MIME
multipart/form-data -- then you could send the messages as an XML
document plus non-XML MIME 'attachments'. I imagine that there is an Apache
server codebase that does this for you. There is even a URI specification
for referencing the 'parts' of the message. Kinda messy though.

> >
> >Obviously, the JPEG file contains special characters when looked at on a
> >byte-by-byte basis.
> >
> >One method I'm considering is converting the JPEG file to base64, which
> >results in a string of text characters, as you know. However, this data
> >could also contain special characters, ie: greater-than symbol, less-than
> >symbol, etc.
>
> base64 has always been recommended for doing what you want and from what I
> understand it actually guarantees that you won't end up with any
> troublesome characters.

Indeed -- base64 encoding eliminates the less/greater than and ampersand
characters, leaving safe ASCII.

> >
> >Assuming this was a good approach, I have two issues remaining: 1) how to
> >code and decode the base64 and 2) can I prevent the DOM API from parsing the
> >encoded JPEG data and converting greater-than and less-than symbols into
> >"lt;" and "gt;" text strings.

You'd also have to escape the ampersands ... I think base64-ing the whole
thing is likely the easiest to do. As Danny notes, there are lots of base64
encoders/decoders out there - I am sure a search would find you a Perl or
Java package to do this.

> 1) I believe there are some Perl modules for doing this, the description of
> "how to do" base64 is in one of the RFCs which I might be able to find.
>
> 2) CDATA sections are another way to go, in these all you have to worry
> about is a string of ']]>' - not sure if that helps any. But with a CDATA
> section the parser is hands off except for that particular string. So this
> wouldn't be a problem for the DOM, you might have problems reading it in
> and then writing a CDATA section back out but you could use a standard tag
> <base64> and always read and write a CDATA section around its content.
>
> 1) You will have to extend your DOM implementation to be able to recognize
> the format and hook some code in to handle it - you won't find this off the
> shelf but I would think it would be relatively straightforward to
> implement. Probably the safest way and allows you to maintain one file.

... but you can probably find base64 encoders/decoders to make things
easier for you.

> 2) It isn't DOM you are fighting it is XML and its parser - CDATA is the
> closest thing to doing this but it isn't a complete solution, using an
> entity reference and an external file is the safe way from a parser/DOM
> standpoint but you have the problem of multiple files.

Yes ... the MIME mechanism I mention is a whole other layer of messiness
that you avoid by encoding stuff and putting it inside the XML.

> >
> >Can you please validate my approach and assumptions? Thanks! Jerry
>
> base64 is usually the first recommendation for doing what you want, the
> cost is having to build the tool to do the work. To remove the work you can
> use the entity method but then you're stuck with multiple files. No magic
> bullet here in XML, you just get a standard way of addressing the world's
> problems - but they're still problems.

I agree -- going the base64 route is I think the easiest approach, but
there is no magic bullet that makes it trivial.

Ian

> ..dan
>
> >
> >-----Original Message-----
> >From: Danny Vint [mailto:dvint@s...]
> >Sent: Saturday, January 06, 2001 4:28 PM
> >To: Jerry Johns; 'Ian Graham'
> >Cc: 'xml-dev@l...'
> >Subject: RE: binary base64 definition
> >
> >
> >Notations and other formats have always been application dependent since
> >SGML days. In that arena we were primarily trying to use graphics and
> >usually just display them. So the SGML editors provided a mechanism to map
> >a tool to a format. Seems like you might be able to do something with OLE
> >on Windows for the same sort of functionality. I'm not sure what was being
> >conveyed about the DOM support, but it would seem like there would be a way
> >to hook into knowing what the NOTATION type was (base64 in this case) and
> >launching some other application to deal with it.
> >
> >Maybe if you describe what the actual need for the base64 is we might be
> >able to offer suggestions along other lines.
> >
> >..dan
> >
> >
> >At 03:31 PM 1/6/2001 -0500, Jerry Johns wrote:
> >>What if I ditched DOM and used another tool for managing the XML file;
> >>could I then insert the base64 content and still be within the XML
> >>standards? Is this a limitation of DOM? Thanks. Jerry
> >>
> >>-----Original Message-----
> >>From: Ian Graham [mailto:igraham@i...]
> >>Sent: Saturday, January 06, 2001 10:50 AM
> >>To: Dan Vint
> >>Cc: Jerry Johns; 'xml-dev@l...'
> >>Subject: Re: binary base64 definition
> >>
> >>
> >>The DOM supports access to notation nodes, but can enforce no statement
> >>about the proper encoding of a referenced external entity (which makes
> >>sense, as it is external to the document).
> >>
> >>Base64 encoding of content inside a document would require custom code for
> >>doing the encoding/decoding, and some attribute-based mechanism for
> >>labeling the 'type' content of the node containing the data. That is
> >>certainly possible, but as far as I can see is outside the scope of the
> >>DOM.
> >>
> >>Ian
> >>
> >>
> >>On Fri, 5 Jan 2001, Dan Vint wrote:
> >>
> >>> You can't use elements this way, but an alternative would be to create a
> >>> NOTATION type and then an external entity of this type - you would copy
> >>> all the contents of whatever should be base 64 into this external file,
> >>> it would be part of the XML document but it would be outside. Not sure if
> >>> DOM has been setup to understand NOTATIONS, but in an SGML world you
> >>> would be able to associate a "processor" with that notation and have it
> >>> called whenever you needed to read or write that format.
> >>>
> >>> Your DTD might look like the following:
> >>>
> >>> <!DOCTYPE .... [
> >>>
> >>> <!NOTATION base64 SYSTEM "binary base64">
> >>> <!ENTITY extfile1 SYSTEM "extfile.b64" NOTATION "base64" >
> >>>
> >>> ]>
> >>> ....
> >>> &extfile1;
> >>> ...
> >>>
> >>> The syntax is probably not exact but you can look up the details.
> >>>
> >>> ..dan
> >>>
> >>> >
> >>> > In the DTD, can I specify a type of "binary base64" for an element so
> >>> > that when I write to the XML file using DOM, DOM will automatically
> >>> > encode the binary data for me without parsing for control characters?
> >>> > If so, I assume it will do the reverse when I read that element.
> >>> >
> >>> > Can anyone validate my assumption about the DTD data type? Has anyone
> >>> > seen an example DTD definition with this in it?
> >>> >
> >>> > Thanks much.
> >>> > Jerry
> >>> >
> >>>
> >
> >---------------------------------------------------------------------------
> >Danny Vint
> >http://www.dvint.com
> >
> >Author: "SGML at Work"
> >http://www.slip.net/~dvint/pubs/sgmlatwork.shtml
> > mailto:dvint@u...
> >
> ---------------------------------------------------------------------------
> Danny Vint
> http://www.dvint.com
>
> Author: "SGML at Work"
> http://www.slip.net/~dvint/pubs/sgmlatwork.shtml
> mailto:dvint@u...
>
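
The base64 route discussed in this thread can be sketched roughly as follows. This is only an illustration, not anything posted by the participants: it assumes Python's standard `base64` and `xml.etree.ElementTree` modules (the thread itself only mentions Perl and Java packages), and the element and attribute names (`payment`, `amount`, `attachment`, `type`, `encoding`) are hypothetical. The attributes stand in for the "attribute-based mechanism for labeling the 'type'" mentioned above.

```python
import base64
import xml.etree.ElementTree as ET


def build_document(amount: str, image_bytes: bytes) -> str:
    """Return an XML string holding a dollar amount plus base64-encoded binary data."""
    root = ET.Element("payment")  # hypothetical element names throughout
    ET.SubElement(root, "amount").text = amount
    attachment = ET.SubElement(
        root, "attachment", {"type": "image/jpeg", "encoding": "base64"}
    )
    # The base64 alphabet is A-Z, a-z, 0-9, '+', '/' and '=' padding, so the
    # encoded text never contains '<', '>' or '&' and needs no escaping.
    attachment.text = base64.b64encode(image_bytes).decode("ascii")
    return ET.tostring(root, encoding="unicode")


def extract_image(xml_text: str) -> bytes:
    """Parse the XML and decode the attachment back to the original bytes."""
    root = ET.fromstring(xml_text)
    attachment = root.find("attachment")
    return base64.b64decode(attachment.text)


if __name__ == "__main__":
    # Stand-in for real JPEG data: arbitrary bytes, including markup characters.
    fake_jpeg = bytes([0xFF, 0xD8, 0xFF, 0xE0]) + b"<&>]]>" + bytes(range(256))
    document = build_document("125.00", fake_jpeg)
    assert extract_image(document) == fake_jpeg
    print("round trip ok")
```

Because the encoded text contains no markup characters, no CDATA section is needed around it, and an ordinary parser or DOM can read and write the element like any other text node. The cost is size: base64 output is roughly a third larger than the binary input.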