Andrew Welch wrote:
No, that's not it. The codepoints are encoded using literal numeric hexadecimal strings (compare &#x0A; in XML, which would be [0A] in the original example).
Well, that's encouraging ;) The project contains strings that are "escaped" in several ways (texts are literal):

  C-style:     \x0ASome text \x22between quotes\x22
  Local style: <0A>Some text <22>between quotes<22>
  Other style: Text with <22,24,54> multiple special chars
  XML-like:    &0A;Some text &22;between quotes&22;

In short: the input is rubbish. But we know for a fact how to get the codepoints. However, in the past, users have made mistakes. The original application simply ignored those mistakes, replacing the illegal codepoints with nothingness. The good news is: all codepoints are Unicode codepoints.

Thanks,
-- Abel
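
For illustration only, here is a minimal Python sketch of how the four escape styles listed above could be turned into characters. It is not the project's actual code. The names ESCAPE, _decode and unescape are hypothetical, and it rests on a few assumptions: the C-style form always has exactly two hex digits (otherwise \x22between would swallow the "be"), and an "illegal" codepoint means a surrogate or anything above U+10FFFF, which is silently dropped.

```python
import re

# Hypothetical pattern for the four escape styles shown in the post.
ESCAPE = re.compile(
    r"\\x([0-9A-Fa-f]{2})"                   # C-style: \x0A (assumed two digits)
    r"|<([0-9A-Fa-f]+(?:,[0-9A-Fa-f]+)*)>"   # local/other style: <0A>, <22,24,54>
    r"|&([0-9A-Fa-f]+);"                     # XML-like: &0A;
)


def _decode(match):
    """Turn one matched escape into its character(s)."""
    # Exactly one of the three groups participates in any match.
    hex_list = next(g for g in match.groups() if g is not None)
    chars = []
    for hex_value in hex_list.split(","):
        codepoint = int(hex_value, 16)
        # Assumption: "illegal" means a surrogate or beyond U+10FFFF.
        if codepoint > 0x10FFFF or 0xD800 <= codepoint <= 0xDFFF:
            continue  # replace the illegal codepoint with nothing
        chars.append(chr(codepoint))
    return "".join(chars)


def unescape(text):
    """Replace every recognised escape in text with its decoded characters."""
    return ESCAPE.sub(_decode, text)


print(unescape("Text with <22,24,54> multiple special chars"))
# prints: Text with "$T multiple special chars
```

Note that the sketch leaves malformed escapes such as <0G> untouched rather than deleting them, and that the angle-bracket pattern would also match ordinary text like <a>, so real input would probably need stricter rules.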