Re: Binary versus Text
Pretty much. That some sequence of bytes can be recognized by the parse rules for some encoding and character set is necessary but not sufficient for the file to be 'text'. It could be an accident. We also have to know that it is supposed to contain characters as the initial layer. (For the BOM, I think it is a character that has been assigned a supplementary role. So not really an exception.)

For record-based storage a la VMS etc., an API may present a file as a virtual text file, but that does not mean the file itself should be considered text rather than binary: same as zip.

That edge cases of small files may not have enough information to make the call does not mean their character is not clear when typical files are considered.

Another example: an RTF file with only hex-encoded images is a text file, because every byte maps to an intended character. But an RTF file with an embedded binary image should be considered a binary file, because those bytes are not first intended as characters: as the RTF 1.8 spec mentions, an RTF parser needs to understand \bin and lump the binary data together.

Cheers

On 27/11/2013 2:19 AM, "John Cowan" <johnwcowan@gmail.com> wrote:
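
To make the "necessary but not sufficient" point and the RTF case concrete, here is a minimal Python sketch. The helper names (decodes_as, rtf_has_raw_binary) and the sample byte strings are made up for illustration. The \bin check is a rough pattern match, not an RTF parser, and it assumes the control word appears unescaped.

```python
import re
import struct


def decodes_as(data: bytes, encoding: str = "utf-8") -> bool:
    """Return True if the bytes satisfy the parse rules of the encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# A 32-bit little-endian integer whose four bytes happen to be valid ASCII.
# It decodes cleanly, but it was never intended as characters.
accidental = struct.pack("<I", 0x64636261)

# Heuristic only: an unescaped \bin control word followed by its byte count.
# A real RTF reader would tokenize the stream instead of pattern matching.
_BIN = re.compile(rb"(?<!\\)\\bin[0-9]+")


def rtf_has_raw_binary(rtf: bytes) -> bool:
    """Return True if the RTF stream appears to embed raw binary via \\bin."""
    return _BIN.search(rtf) is not None


# Hypothetical fragments: one image as hex digits, one as raw bytes.
hex_only = rb"{\rtf1{\pict\pngblip 89504e47}}"
with_bin = rb"{\rtf1{\pict\pngblip\bin4 " + b"\x89PNG" + b"}}"

print(decodes_as(accidental))        # decodable, yet not 'text' by intent
print(rtf_has_raw_binary(hex_only))  # every byte is an intended character
print(rtf_has_raw_binary(with_bin))  # raw bytes follow the \bin control word
```

The decodability check alone cannot separate the first case from genuine text. Only knowledge of what the bytes are meant to be does that, which is why the RTF check looks at the format's own signal (\bin) rather than at whether the bytes happen to decode.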