  • From: Philippe Poulard <philippe.poulard@s...>
  • To: David Carlisle <davidc@n...>
  • Date: Thu, 20 Sep 2007 15:34:22 +0200

David Carlisle wrote:
> There, if there is no external metadata or xml declaration the file has
> to be in utf16 or utf8, and the BOM is optional for utf8, so if the file
> has no BOM, then the parser does not "give up". The file is treated as if
> utf8 is specified.

The BOM only makes sense for 16-bit encodings; it indicates the 
endianness, that is to say, which of the 2 bytes is stored first.

If the encoding is specified as UTF-16 with no further indication of 
endianness, or is not specified at all, the BOM, if present, states 
whether it is UTF-16LE or UTF-16BE (it seems that when the encoding is 
already specified, the BOM is redundant information).

If the BOM is missing and the encoding is not specified, the encoding is 
either UTF-16xx (I don't remember which one is the default :) ) or UTF-8.
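
To make the cases above concrete, here is a rough Python sketch of BOM sniffing for a file that has neither external metadata nor an encoding declaration. The names sniff_bom and decode_without_declaration are hypothetical, UTF-32 is deliberately ignored, and the fallback to UTF-8 when no BOM is found follows the behaviour David describes in the quoted text; it is an illustration, not what any particular parser does.

```python
import codecs

def sniff_bom(data: bytes):
    """Guess the encoding from a leading BOM.

    Returns (encoding_name, bom_length), or (None, 0) when no BOM is found.
    Assumption: only UTF-8 and UTF-16 are considered; UTF-32 is ignored.
    """
    if data.startswith(codecs.BOM_UTF8):        # EF BB BF
        return "utf-8", len(codecs.BOM_UTF8)
    if data.startswith(codecs.BOM_UTF16_BE):    # FE FF: high byte first
        return "utf-16-be", len(codecs.BOM_UTF16_BE)
    if data.startswith(codecs.BOM_UTF16_LE):    # FF FE: low byte first
        return "utf-16-le", len(codecs.BOM_UTF16_LE)
    return None, 0

def decode_without_declaration(data: bytes) -> str:
    """Decode bytes that carry no encoding declaration.

    Assumption: with no BOM, the bytes are treated as UTF-8.
    """
    encoding, skip = sniff_bom(data)
    if encoding is None:
        encoding = "utf-8"
    # Drop the BOM bytes so they do not show up as U+FEFF in the text.
    return data[skip:].decode(encoding)
```

For example, decode_without_declaration(codecs.BOM_UTF16_LE + "<a/>".encode("utf-16-le")) and decode_without_declaration(b"<a/>") both give back the same four characters.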

-- 
Regards,

               ///
              (. .)
  --------ooO--(_)--Ooo--------
|      Philippe Poulard       |
  -----------------------------
  http://reflex.gforge.inria.fr/
        Have the RefleX !

