RE: UTF-8 Question: e with acute accent should require two bytes, right?
Hex editors show you what they've got in memory, not what's on the disk. So this tells you that the editor has converted the data to iso-8859-1 or something similar for processing in memory.

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: Costello, Roger L. [mailto:costello@m...]
> Sent: 28 September 2007 16:13
> To: xml-dev@l...
> Subject: UTF-8 Question: e with acute accent should
> require two bytes, right?
>
> Hi Folks,
>
> Consider this element:
>
>     <title>My Resumé</title>
>
> Notice: é (the character "e" with an acute accent). It is U-00E9
>
> Since its code point is greater than U+0080, it requires more
> than one byte.
>
> Hex E9 = Decimal 233. This has the binary: 11101001
>
> I believe that it is encoded in UTF-8 as two bytes:
>
>     11000011 10101001
>
> These bytes correspond to hex C3 and hex A9.
>
> Thus, é should be encoded in UTF-8 as:
>
>     C3A9
>
> The code points of the other characters (My Resum) are all
> less than U-0080, and so the UTF-8 encoding of those
> characters should be only one byte.
>
> So, this is what I believe should be the bytes:
>
>     M y    R  e s  u m  é
>     4D79 2052 6573 756D C3A9
>
> Do you agree?
>
> However, when I view the bytes in my hex editor I get this:
>
>     M y    R  e s  u m  é
>     4D79 2052 6573 756D E9
>
> Notice that é uses only one byte.
>
> Something is wrong. Here's what I think may be wrong:
> - the editor that I am using to display the hex values is
> displaying the code points and not the hex values. However, I
> have now tried two editors, and they both display the same
> thing (E9). So perhaps the editor isn't the problem.
> Perhaps I'm the problem, and am misunderstanding something. Help!
>
> /Roger
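The byte arithmetic in the message above can be checked with a short Python sketch. It is not part of the original thread, and the helper names `encode_two_byte` and `hex_dump` are made up for illustration. It hand-packs U+00E9 into the two-byte UTF-8 pattern (110xxxxx 10xxxxxx, which covers U+0080 through U+07FF), then compares the UTF-8 and ISO-8859-1 serializations of the same string. Assuming the editors did convert to ISO-8859-1 as suggested in the reply, the second dump is what they would display.

```python
# A minimal sketch, not taken from the thread: helper names are illustrative.

def encode_two_byte(code_point: int) -> bytes:
    """Hand-encode a code point in the two-byte UTF-8 range (U+0080..U+07FF)."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("code point is outside the two-byte UTF-8 range")
    lead = 0xC0 | (code_point >> 6)      # 110xxxxx: top 5 of the 11 payload bits
    cont = 0x80 | (code_point & 0x3F)    # 10xxxxxx: low 6 payload bits
    return bytes([lead, cont])


def hex_dump(data: bytes) -> str:
    """Format bytes as uppercase hex in groups of two bytes."""
    digits = data.hex().upper()
    return " ".join(digits[i:i + 4] for i in range(0, len(digits), 4))


text = "My Resum\u00e9"  # \u00e9 is e with acute accent

print(hex_dump(encode_two_byte(0xE9)))        # C3A9
print(hex_dump(text.encode("utf-8")))         # 4D79 2052 6573 756D C3A9
print(hex_dump(text.encode("iso-8859-1")))    # 4D79 2052 6573 756D E9
```

To see what is actually stored on disk rather than an editor's in-memory view, one option is to read the file in binary mode (`open(path, "rb").read()`, where `path` is whatever file holds the XML) and pass the result to `hex_dump`.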







