Re: (char)0 handling proposal
> > "Now is the time for all good men to come to the aid of the party@@@@@@@@@"
> >
> > "@" is the null char - when a String is *mostly* text, it would be nice to
> > render the readable text as human readable...
>
> Create a new simpleType quotedPrintable, then you can have
>
> "Now is the time for all good men to come to the aid of the
> party=00=00=00=00=00=00=00=00=00=00=00=00=00=00=00=00=00=00"
>
> where the string is converted to UTF-16LE before applying QP. This is as
> human readable as possible. But please note that wouldn't be a very
> interoperable solution and I discourage such multi-level encodings.

I agree about multi-level encodings, but it does seem the only way to cater for both human consumption and binary data.

> If it's binary don't use XML (directly) or use the mentioned types. Who
> cares about human consumption _and_ uses binary data?

It's because the "char" datatype of Java is ambivalent. It usually contains Unicode, but it can also be treated as a 16-bit unsigned integer.[*]

More and more I think you are right, that if a String does contain non-text values (by the XML definition), then it should be treated entirely as binary.

Incidentally, a way to serve both the concerns of human consumption and binary data is to render binary in this strangely familiar format:

<Binary>
0000000: 4e6f 7720 6973 2074 6865 2074 696d 6520  Now is the time 
0000010: 666f 7220 616c 6c20 676f 6f64 206d 656e  for all good men
0000020: 2074 6f20 636f 6d65 2074 6f20 7468 6520   to come to the 
0000030: 6169 6420 6f66 2074 6865 2070 6172 7479  aid of the party
0000040: 0000 0000 0000 0000 000a                 @@@@@@@@@.
</Binary>

This kind of format is *the most* human readable way to present binary data. It can be edited effectively via the hex representation, and the text representation is "read-only" (a kind of markup of the real data).
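For illustration, here is a minimal Java sketch of producing the hex-plus-text layout described above. It rests on several assumptions that are not part of the proposal itself: the class and method names (`HexDump`, `dump`) are hypothetical, the String is narrowed to one byte per char via ISO-8859-1 (as the dump in this message appears to do), NUL is shown as "@" following the convention used earlier in this thread, and any other non-printable byte is shown as ".".

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any proposal in this thread.
final class HexDump {

    /** Formats bytes as rows of: 7-digit hex offset, up to eight 2-byte hex groups, text column. */
    static String dump(byte[] data) {
        StringBuilder out = new StringBuilder();
        for (int row = 0; row < data.length; row += 16) {
            StringBuilder hex = new StringBuilder();
            StringBuilder text = new StringBuilder();
            int end = Math.min(row + 16, data.length);
            for (int i = row; i < end; i++) {
                int b = data[i] & 0xff;              // treat the byte as unsigned
                hex.append(String.format("%02x", b));
                if ((i - row) % 2 == 1) {
                    hex.append(' ');                 // space after every second byte
                }
                if (b == 0) {
                    text.append('@');                // assumed convention: NUL shown as '@'
                } else if (b >= 0x20 && b < 0x7f) {
                    text.append((char) b);           // printable ASCII shown as-is
                } else {
                    text.append('.');                // everything else shown as '.'
                }
            }
            // Pad the hex column to 40 characters so the text column lines up on short rows.
            out.append(String.format("%07x: %-40s %s\n", row, hex, text));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String s = "Now is the time for all good men to come to the aid of the party"
                 + "\0\0\0\0\0\0\0\0\0\n";
        // Assumption: one byte per char; chars above U+00FF would be lost by this encoding.
        System.out.print(dump(s.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

Because the text column is derived from the bytes, only the hex column needs to be parsed when reading the data back, which matches the "read-only" role of the text representation.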
The addresses on the left are a non-XML markup - but this could be done in an XML style, e.g.:

  <bin addr="0000000"> 4e6f 7720 6973 2074 6865 2074 696d 6520 </bin>

or

  <b a="0000000" t="Now is the time ">4e6f 7720 6973 2074 6865 2074 696d 6520</b>

(Based on an idea by Mark Collette, for using hex to represent binary in XML)

Cheers!
Brendan

--
e: bren@m...
v: +61 (3) 9905 1502
Email is checked daily
Phone is rarely attended

[*] As the XML definition of "text" grows more important, it would be nice if languages had a primitive datatype for "textchar" or "XMLchar". This would avoid the need to check the range of values it contains. But I guess there are many reasons to use primitives that are a multiple of 8 bits in length (exception: boolean, but it doesn't require extra validation checks.)
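The range check that a "textchar" or "XMLchar" primitive would make unnecessary might look roughly like the following Java sketch. The class and method names (`XmlText`, `isXmlChar`, `isXmlText`) are hypothetical; the ranges are those of the `Char` production in XML 1.0. A String that fails the check (for example, one containing (char)0) would be a candidate for the binary rendering instead.

```java
// Hypothetical helper illustrating the per-character validation a plain Java char requires.
final class XmlText {

    /** True if the code point is allowed by the XML 1.0 Char production. */
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** True if every code point in the String is legal XML 1.0 text. */
    static boolean isXmlText(String s) {
        // codePoints() pairs up valid surrogates; an unpaired surrogate is
        // reported as itself (0xD800-0xDFFF) and is rejected by isXmlChar.
        return s.codePoints().allMatch(XmlText::isXmlChar);
    }

    public static void main(String[] args) {
        System.out.println(isXmlText("Now is the time"));   // plain text
        System.out.println(isXmlText("the party\0\0\0"));   // contains (char)0
    }
}
```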