RE: UTF-8 use with XML
Thanks Tim,

Here is the hex for <BirthCity>K?/BirthCity>:

3C 42 69 72 74 68 43 69 74 79 3E 4B EF BF BD 2F 42 69 72 74 68 43 69 74 79 3E

EF BF BD are the questionable characters which replaced 3C.

Craig

-----Original Message-----
From: Tim Bray [mailto:tbray@t...]
Sent: Friday, June 13, 2003 11:16 AM
To: Long, Craig Z
Cc: xml-dev@l...
Subject: Re: UTF-8 use with XML

Long, Craig Z wrote:
> Given the following element using a utf character (created by a user's
> system): <BirthCity>Trenton?/BirthCity> I've been told my system should be
> programmed to accept this. I can't find any documentation which supports
> yes or no to this premise. Currently we reject this as not well-formed XML.
> Please offer expertise concerning this issue.

If it really contains a UTF8 character, no programming should be required; all conforming XML software is required to accept UTF data. Things that could be wrong:

- there's an encoding declaration at the front of the file saying it's something other than UTF-8
- you think it's UTF-8 but it isn't.

If there's no encoding declaration, then the second is almost certainly true. If you provide a hex dump of the affected region there are several people here who could look at it and tell you whether it's really UTF-8

--
Cheers, Tim Bray
(ongoing fragmented essay: http://www.tbray.org/ongoing/)
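For readers who want to check the posted bytes themselves, here is a minimal sketch in Python. It assumes the hex dump is exactly as quoted in this message; the variable names (`hex_dump`, `raw`, `text`, `well_formed`) are illustrative only and do not come from the thread. It decodes the bytes strictly as UTF-8, names any non-ASCII character it finds, and then hands the same bytes to the standard-library XML parser.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Hex dump as posted in the message (assumed complete and accurate).
hex_dump = (
    "3C 42 69 72 74 68 43 69 74 79 3E 4B EF BF BD "
    "2F 42 69 72 74 68 43 69 74 79 3E"
)
raw = bytes.fromhex(hex_dump)

# Strict decode: raises UnicodeDecodeError if the bytes are not valid UTF-8.
text = raw.decode("utf-8")
print(repr(text))

# Report every non-ASCII character with its code point and Unicode name.
for ch in text:
    if ord(ch) > 0x7F:
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# Check well-formedness with the standard-library parser.
try:
    ET.fromstring(raw)
    well_formed = True
except ET.ParseError as exc:
    well_formed = False
    print("not well-formed:", exc)
```

The bytes EF BF BD are the UTF-8 encoding of U+FFFD, the Unicode replacement character, so the data is valid UTF-8 at the byte level. One plausible reading, not confirmed in this thread, is that some earlier conversion step could not interpret the original bytes and substituted U+FFFD, consuming the 3C (`<`) of the end tag along the way. With that `<` gone, the element is never closed, which is why a parser rejects the fragment regardless of encoding.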