RE: UTF-8+names
Tim Bray wrote:
> James Clark wrote:
>
> > But with +names you don't want to work at the encoding level. For
> > example, if you have a ü in your text file, that will be two bytes
> > in UTF-8+names, but you would want to work with it as a single
> > character. To edit a UTF-8+names text file, you need to make your
> > text editor treat it as if it were encoded in UTF-8. In other words,
> > to make things work you have to edit it in the wrong encoding. This
> > will be extremely confusing to users.
>
> I'm not sure I agree. In UTF-8+names, ü could show up either as
> itself or as &uuml;

The point is how you make it show up as &uuml;.

You normally don't see on the screen the bits and bytes of the encoding of a character; you see some display form of the encoded character. & u u m l ; is the UTF-8 re-interpretation of a UTF-8+names bit pattern. If this reinterpretation doesn't take place, the human user will not see & u u m l ; on her screen - she will see some display form of the LATIN U WITH DIAERESIS character.

Editors would have to be modified - and in ways that affect their processing model - to be able to handle this.

Alessandro
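
To make the two readings of the same bytes concrete, here is a rough Python sketch. The function decode_utf8_names is a hypothetical stand-in, not the actual UTF-8+names definition: it assumes the encoding behaves like UTF-8 plus expansion of named references, and it borrows the HTML 4 name table from Python's html.entities module purely for illustration.

```python
import re
from html.entities import name2codepoint

# Pattern for a named reference such as &uuml; (assumed syntax).
_NAME_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def decode_utf8_names(data: bytes) -> str:
    """Hypothetical UTF-8+names decoder: UTF-8 plus named references."""
    text = data.decode("utf-8")

    def expand(match):
        codepoint = name2codepoint.get(match.group(1))
        # Leave unknown names untouched rather than guessing.
        return chr(codepoint) if codepoint is not None else match.group(0)

    return _NAME_REF.sub(expand, text)

raw = b"M&uuml;nchen"

# Read as plain UTF-8: the user sees the six characters & u u m l ;
as_utf8 = raw.decode("utf-8")

# Read through the hypothetical decoder: the user sees a single ü
as_names = decode_utf8_names(raw)

# The literal character, by contrast, is two bytes in UTF-8
literal = "ü".encode("utf-8")
```

An editor that opens the file as UTF-8 works with as_utf8, while an editor that honours the encoding works with as_names and has no way of showing which form was stored unless its processing model tracks that separately.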