RE: UTF-8 Question: e with acute accent should require two byt
Not sure which editors you are using, but I used Eclipse to create a UTF-8 file, then viewed it using "od -t x1". My file says "My résumé". The dump shows:

    0000000 4d 79 20 72 c3 a9 73 75 6d c3 a9

The é is encoded as c3a9. Which editor did you use to originally create the file?

-Erik

-----Original Message-----
From: Costello, Roger L. [mailto:costello@m...]
Sent: September 28, 2007 11:13 AM
To: xml-dev@l...
Subject: UTF-8 Question: e with acute accent should require two bytes, right?

Hi Folks,

Consider this element:

    <title>My Resumé</title>

Notice: é (the character "e" with an acute accent). It is U+00E9.

Since its code point is at or above U+0080, it requires more than one byte.

Hex E9 = Decimal 233. This has the binary: 11101001

I believe that it is encoded in UTF-8 as two bytes: 11000011 10101001

These bytes correspond to hex C3 and hex A9. Thus, é should be encoded in UTF-8 as: C3A9

The code points of the other characters (My Resum) are all less than U+0080, and so the UTF-8 encoding of each of those characters should be only one byte.

So, this is what I believe the bytes should be:

    M y R e s u m é
    4D79 2052 6573 756D C3A9

Do you agree?

However, when I view the bytes in my hex editor I get this:

    M y R e s u m é
    4D79 2052 6573 756D E9

Notice that é uses only one byte. Something is wrong.

Here's what I think may be wrong:

- the editor that I am using to display the hex values is displaying the code points and not the encoded bytes.

However, I have now tried two editors, and they both display the same thing (E9). So perhaps the editor isn't the problem. Perhaps I'm the problem, and am misunderstanding something.

Help!

/Roger

_______________________________________________________________________

XML-DEV is a publicly archived, unmoderated list hosted by OASIS to support XML implementation and development. To minimize spam in the archives, you must subscribe before posting.
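For anyone following the thread, the arithmetic in Roger's message can be checked with a short Python sketch. The helper names `utf8_two_byte` and `hex_dump` are made up for this illustration and are not part of any library. The last step rests on an assumption that the thread does not confirm: that a lone E9 byte would appear if the file had been saved in a single-byte encoding such as ISO-8859-1 rather than UTF-8.

```python
def utf8_two_byte(code_point: int) -> bytes:
    """Encode a code point in U+0080..U+07FF using the two-byte UTF-8 form."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("only U+0080..U+07FF use the two-byte form")
    lead = 0b11000000 | (code_point >> 6)            # 110xxxxx: upper 5 bits
    trail = 0b10000000 | (code_point & 0b00111111)   # 10xxxxxx: lower 6 bits
    return bytes([lead, trail])


def hex_dump(data: bytes) -> str:
    """Format bytes as space-separated lowercase hex, similar to od -t x1."""
    return " ".join(f"{b:02x}" for b in data)


# "\u00e9" is é, written as an escape so this script does not depend on
# the encoding of its own source file.
text = "My Resum\u00e9"

manual = utf8_two_byte(0xE9)
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("iso-8859-1")  # assumption: one possible source of E9

print(hex_dump(manual))        # c3 a9
print(hex_dump(utf8_bytes))    # 4d 79 20 52 65 73 75 6d c3 a9
print(hex_dump(latin1_bytes))  # 4d 79 20 52 65 73 75 6d e9
```

The first two dumps agree with the C3A9 that Roger expected and that Erik saw from Eclipse. The third matches what Roger's hex editor showed, which would be consistent with the file having been written in a single-byte encoding, although only the answer to Erik's question about the editor could settle that.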