> -----Original Message-----
> From: Seairth Jacobs [mailto:seairth@s...]
> Sent: Wednesday, April 02, 2003 13:49
> To: xml-dev
> Subject: BASE64 (was Re: CDATA)
>
> I still don't see why a <![BASE64[ ]]> isn't added.
>
> 1) Nothing needs escaping.
>
> 2) The encoded form falls neatly into all content encoding
> forms (I think), so parsers don't have to switch between
> "character" and "octet" hats.
>
> 3) When someone asks "how do I handle binary?", the answer
> would be a flat "<![BASE64[ ]]>" instead of "Well, can do
> this... or this... or this... and you are responsible to all
> encoding/decoding". I suspect much less grumbling will occur.
>
> 4) For anyone arguing that it causes bloat: why are you using
> XML in the first place then?
>
> 5) It's a clean, simple, and well-used technique.
>
> 6) It's about as 80/20 a solution as I can think of.
>
> So why not add it?

I think this **might** work, but...

It would require an important conceptual change in the Infoset specification, because none of the current children of the element information item is able to hold a binary octet. Should a new information item, as a new kind of child, be introduced? Probably a single octet, or an octet string.

The character information item cannot be used because it holds an ISO 10646 character, not a binary octet. Even with the XML 1.1 provision allowing control characters in XML (as numeric character references), control characters are still ISO 10646 characters, not binary octets - and the NUL character is not permitted anyway.

Moreover, if an XML document is converted to an infoset and then back to XML, we will want those BASE64 sections back; we won't want them changed into a series of meaningless characters or numeric character references. Also, we don't want a sequence of characters or numeric character references to be changed into a BASE64 section.

These seem good arguments for introducing a distinct information item, if a BASE64 section is introduced in XML.
What about schema languages? They don't know anything about octet information items. How would such data be described/constrained?

Intuitively, an xsd:base64Binary or an xsd:hexBinary should validate an element that contains a BASE64 section and nothing else (or should they?). But clearly, in this case the datatype is not constraining the child characters of the element information item, but its child binary octets. So there are conceptual changes here also.

There may be many more complications down the road, in these or other areas.

Alessandro Triglia

> ---
> Seairth Jacobs
> seairth@s...
>
> -----------------------------------------------------------------
> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> initiative of OASIS <http://www.oasis-open.org>
>
> The list archives are at http://lists.xml.org/archives/xml-dev/
>
> To subscribe or unsubscribe from this list use the subscription
> manager: <http://lists.xml.org/ob/adm.pl>



