Re: (char)0 handling proposal
Thanks for your comments Richard,

> >The big problem with this approach is how to encode characters which
> >were already in the range 0x7F - 0x9F... it might not happen often,
> >but a bijective mapping (ie reversible) needs to be able to handle
> >all cases!
>
> There are a number of other ranges that might be used.
> The Unicode Private Use Area (codes E000 - F8FF) is an obvious
> choice. There's also the "Control Pictures" area 2400-241F
> which has the remarkable property that given a complete Unicode
> font the control characters would actually be readable!

But you get the same problem: you might need to encode any character (which in Java is a 2-byte Unicode value), so shifting control characters into some other range means that you can no longer represent characters that were already in that range.

I suppose the argument might be that if someone is using these areas, then it *really is* binary data, and so one would switch to a binary rendering.

I think this is a nice and logical solution: do as you suggest, and map the control characters to the Private Use Area or another range; only if you encounter character values already in that range do you switch to binary.

The downside is in performance: you need to pre-parse the String to check for such unusual values before you can write anything.

Cheers,
Brendan

--
e: bren@m...
v: +61 (3) 9905 1502
Email is checked daily
Phone is rarely attended
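
For illustration, here is a minimal Java sketch of the scheme described above. It is only one possible reading of the proposal, not an agreed implementation. The class name `ControlCharMapper`, the method names, and the choice of the 32-character window starting at U+E000 are all assumptions made for this example. It also assumes that tab, line feed and carriage return are left alone, since XML 1.0 already permits them.

```java
public final class ControlCharMapper {
    // Assumption: map control characters into the first 32 code points
    // of the Unicode Private Use Area (U+E000 - U+E01F).
    static final int PUA_BASE = 0xE000;
    static final int WINDOW = 0x20;

    // Control characters that XML 1.0 does not allow in content.
    static boolean isIllegalControl(char c) {
        return c < WINDOW && c != '\t' && c != '\n' && c != '\r';
    }

    // True if c already sits in the window we want to map into.
    static boolean inWindow(char c) {
        return c >= PUA_BASE && c < PUA_BASE + WINDOW;
    }

    // The pre-parse step: if the input already uses the target window,
    // the mapping would not be reversible, so the caller should fall
    // back to a binary rendering instead.
    public static boolean needsBinary(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (inWindow(s.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    // Shift each illegal control character up into the window.
    public static String encode(String s) {
        if (needsBinary(s)) {
            throw new IllegalArgumentException(
                "input already uses the mapping window; use a binary rendering");
        }
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            out.append(isIllegalControl(c) ? (char) (PUA_BASE + c) : c);
        }
        return out.toString();
    }

    // Reverse the shift for any character found in the window.
    public static String decode(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            out.append(inWindow(c) ? (char) (c - PUA_BASE) : c);
        }
        return out.toString();
    }
}
```

Because `encode` refuses any input that already contains a character in the window, `decode(encode(s))` returns `s` for every string it accepts. The cost is the extra pass over the string in `needsBinary`, which is the performance downside mentioned above.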