
  • From: Tim Bray <tbray@t...>
  • To: xml-dev@l...
  • Date: Fri, 26 Oct 2001 08:18:56 -0700

At 02:28 PM 26/10/01 +1000, Rick Jelliffe wrote:
> From: "Bjoern Hoehrmann" <derhoermi@g...>
> 
>> So, who tells me I
>> am wrong and text/xml documents without charset parameter may still be
>> UTF-8 encoded (and use non-ASCII characters)? ...
>
>The only ways out of encoding hell are:

Actually, XML *improves* the situation.  To quote Larry Wall,
"An XML document knows what encoding it's in".  So, given the
(not uncommon) scenario of MIME header breakage, you can often
recover.  A decent XML processor, given a stream of bytes and
no other information, almost always does the right thing.
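
To make "does the right thing" concrete, here is a rough sketch of the
kind of byte sniffing described in the non-normative Appendix F of the
XML 1.0 spec: look for a byte order mark, then for the byte pattern of
"<?", then read the encoding declaration. The function name
sniff_xml_encoding is hypothetical, the sketch skips EBCDIC and
BOM-less UTF-32, and a real processor does considerably more.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration in an
# ASCII-compatible byte stream (UTF-8, ISO-8859-x, Shift_JIS, ...).
_ENCODING_DECL = re.compile(
    rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_xml_encoding(data: bytes) -> str:
    """Guess the encoding of an XML byte stream with no outside help."""
    # 1. Byte order marks. UTF-32 must be tested before UTF-16 because
    #    the UTF-32LE BOM begins with the UTF-16LE BOM.
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    if data.startswith((b"\xff\xfe\x00\x00", b"\x00\x00\xfe\xff")):
        return "utf-32"
    if data.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16"

    # 2. No BOM, but "<?" encoded as two-byte code units.
    if data.startswith(b"\x00<\x00?"):
        return "utf-16-be"
    if data.startswith(b"<\x00?\x00"):
        return "utf-16-le"

    # 3. ASCII-compatible stream: trust the encoding declaration.
    match = _ENCODING_DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()

    # 4. No BOM and no declaration: the spec's default is UTF-8.
    return "utf-8"
```

For example, sniff_xml_encoding(b'<?xml version="1.0"
encoding="ISO-8859-1"?><a/>') returns "iso-8859-1", and a bare
b"<a/>" falls through to "utf-8".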

Per IETF dogma, the XML spec and the RFC both say that the
charset header is authoritative.  Well, yes, except when it
isn't.  Software that ignores it when it's demonstrably 
wrong is hard to get too angry at. -Tim
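
One possible reading of "demonstrably wrong", sketched under stated
assumptions: the bytes simply do not decode in the charset the header
claims. The names pick_encoding and _decodes are hypothetical, the
fallback order (header, then XML declaration, then UTF-8) is my own
choice rather than anything the XML spec or the RFC prescribes, and
the declaration lookup only works for ASCII-compatible streams.

```python
import re
from typing import Optional

# Encoding pseudo-attribute of an XML declaration (ASCII-compatible only).
_DECL = re.compile(
    rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def _decodes(body: bytes, name: str) -> bool:
    """True if `body` decodes cleanly under the codec called `name`."""
    try:
        body.decode(name)
        return True
    except (UnicodeDecodeError, LookupError):
        # LookupError: Python does not know a codec by that name.
        return False


def pick_encoding(body: bytes, header_charset: Optional[str]) -> str:
    """Honour the charset header unless the bytes contradict it."""
    # The header is authoritative as long as it is not provably wrong.
    if header_charset and _decodes(body, header_charset):
        return header_charset.lower()

    # Header missing or contradicted: ask the document itself.
    match = _DECL.match(body)
    if match:
        declared = match.group(1).decode("ascii").lower()
        if _decodes(body, declared):
            return declared

    # Nothing usable: fall back to XML's default.
    return "utf-8"
```

With a UTF-8 body containing non-ASCII characters and a header that
says us-ascii, this returns "utf-8". Note the limit of the test: a
header of iso-8859-1 can never be caught this way, since every byte
sequence decodes as Latin-1.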

