Re: UTF-8 Question: e with acute accent should require two byte
Jonathan Robie wrote:
> Hi Roger,
>
> UTF-8 uses an 8 bit encoding. E9 fits in 8 bits. It doesn't fit in 7,
> but there's no such thing as UTF-7, the problem you refer to is an ASCII
> 7-bit problem. Since 8 bits represents twice as many characters as 7
> bits, it's enough to represent most European languages using one byte
> per character.
>
> Jonathan

Ahem, this is either incorrect or at least expressed in a confusing way.

UTF-8 encodes each code point as a sequence of one to four 8-bit bytes. As UTF-8 can encode all Unicode code points, most of them -- all characters with code points >= 128 -- need two or more bytes.

So no, although E9 fits into 8 bits, its UTF-8 encoding requires more than one byte: U+00E9 is encoded as the two bytes C3 A9.

BR, Julian
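To make the byte counts concrete, here is a minimal Python sketch. It is an illustration only and not part of the original exchange; the helper name `utf8_length` is hypothetical. It encodes U+00E9 in both UTF-8 and Latin-1 (ISO-8859-1), which is presumably the single-byte encoding where E9 alone stands for this character, and computes the UTF-8 length from the standard code point ranges.

```python
# Compare how U+00E9 (e with acute accent) is stored in UTF-8 and in Latin-1.
ch = "\u00e9"

utf8_bytes = ch.encode("utf-8")      # multi-byte sequence in UTF-8
latin1_bytes = ch.encode("latin-1")  # single byte in ISO-8859-1


def utf8_length(code_point: int) -> int:
    """Return how many bytes UTF-8 needs for one code point.

    Hypothetical helper for illustration; the thresholds are the
    standard UTF-8 range boundaries.
    """
    if code_point < 0x80:      # ASCII range: 1 byte
        return 1
    if code_point < 0x800:     # up to U+07FF: 2 bytes
        return 2
    if code_point < 0x10000:   # rest of the Basic Multilingual Plane: 3 bytes
        return 3
    return 4                   # supplementary planes: 4 bytes


print(utf8_bytes.hex())    # c3a9
print(latin1_bytes.hex())  # e9
print(utf8_length(0xE9))   # 2
```

Only code points below 128 keep their one-byte form in UTF-8, so the byte E9 on its own is not a valid UTF-8 encoding of any character.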