Re: Document encodings
Yes. A succession of features is examined, one after another, until a definite result is reached.

1) EXTERNAL: Information sent in the MIME header.
2) BOM: Presence or absence of a Byte Order Mark (BOM), a Unicode signal that tells you whether you are using 16- or 32-bit characters, and their "endianness".
3) FAMILY SIGNATURE: Presence of the expected codes for "<?xml" at the beginning of the file (enough to know whether 8-bit codes are used, and whether they are ASCII-based or EBCDIC-based).
4) ENCODING: Knowing the family signature is enough to read the encoding parameter of the XML header.
5) DEFAULT: Otherwise UTF-8 (which also encompasses ASCII).

The important thing is that this is not guesswork. There is no scope for one parser determining one encoding and another parser determining a different one: every XML processor should be able to say "Yes, I can handle this entity" or "No, I cannot handle this entity". All processors are required to support the UTF-8 and UTF-16 encodings.

Some character sets have some instability about them (see http://www.w3.org/TR/japanese-xml/), but this is an exception.

Cheers
Rick Jelliffe

----- Original Message -----
From: "Phil Ruelle" <philr@i...>
To: <xml-dev@l...>
Sent: Friday, 6 July 2001 PM 04:16
Subject: Document encodings

> A quick question:
>
> How do parsers work out what encoding an XML document is in
> (i.e. how is it able to read the 'encoding' attribute of the
> declaration)?
>
> I'm guessing that all the encodings XML supports have a common
> 'root' so the XML declaration can always be read using the 'base'
> character set. Is this correct or am I way off the mark?
>
> Many thanks,
>
> Phil Ruelle
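
The five-step sequence described in the reply can be sketched in Python roughly as follows. This is only an illustration, not how any particular parser does it: the names `detect_xml_encoding`, `BOMS`, `FAMILIES` and `DECL` are made up for this example, the byte signatures are the ones commonly listed for XML auto-detection, and falling back to `cp037` for an EBCDIC document with no encoding declaration is an assumption of the sketch.

```python
import re

# Step 2: byte order marks, longest first so UTF-32 is tested before UTF-16.
BOMS = [
    (b"\x00\x00\xfe\xff", "utf-32-be"),
    (b"\xff\xfe\x00\x00", "utf-32-le"),
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xfe\xff", "utf-16-be"),
    (b"\xff\xfe", "utf-16-le"),
]

# Step 3: how the start of "<?xml" looks in each encoding family (no BOM).
FAMILIES = [
    (b"\x00\x00\x00<", "utf-32-be"),
    (b"<\x00\x00\x00", "utf-32-le"),
    (b"\x00<\x00?", "utf-16-be"),
    (b"<\x00?\x00", "utf-16-le"),
    (b"<?xm", "ascii"),              # any ASCII-based 8-bit encoding
    (b"\x4c\x6f\xa7\x94", "cp037"),  # "<?xm" in EBCDIC
]

# Step 4: the encoding parameter inside the XML header.
DECL = re.compile(r'<\?xml\s[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']')


def detect_xml_encoding(data, external=None):
    """Return a lower-cased encoding name for the XML entity in `data` (bytes)."""
    # 1) EXTERNAL: a charset from the MIME header wins outright.
    if external:
        return external.lower()

    # 2) BOM, then 3) FAMILY SIGNATURE.
    family, offset = None, 0
    for bom, name in BOMS:
        if data.startswith(bom):
            family, offset = name, len(bom)
            break
    if family is None:
        for signature, name in FAMILIES:
            if data.startswith(signature):
                family = name
                break

    # 5) DEFAULT: no BOM and no recognisable "<?xml" means UTF-8.
    if family is None:
        return "utf-8"

    # 4) ENCODING: only the 8-bit families need the declared name to choose
    # among several possible encodings; for UTF-16/32 the family is the answer.
    if family in ("ascii", "cp037"):
        head = data[offset:offset + 400].decode(family, errors="replace")
        match = DECL.match(head)
        if match:
            return match.group(1).lower()
        # Assumption of this sketch: undeclared EBCDIC is treated as cp037.
        return "utf-8" if family == "ascii" else family
    return family
```

For example, `detect_xml_encoding(b"<?xml version='1.0' encoding='ISO-8859-1'?><a/>")` returns `"iso-8859-1"`, while a document with no BOM and no XML header, such as `b"<a/>"`, falls through to `"utf-8"`. A real processor would then either accept the resulting name or report that it cannot handle the entity.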