XML-DEV Mailing List Archive

Re: Attributes as tags in namespaces and how to guess characte


Martin Olsson wrote:

> --- QUESTION 2
> 
> XML files can use different character encodings including UNICODE and 
> normal ascii text files. An XML parser must know what encoding is used 
> before it starts to process the file, loading a UNICODE file is very 
> different from loading a normal text file. The parser can obviously not 
> first read the encoding attribute of the XML declaration which is the 
> first line of the XML file and then load the file. 

On the contrary, the XML declaration contains only ASCII characters, 
apart from a possible byte order mark (BOM), so the processor can tell 
8-bit from 16-bit encodings by looking at the BOM and the bytes of 
<?xml, and then read the encoding declaration, knowing that its 
characters are ASCII.
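
To make that concrete, here is a rough Python sketch of the two-step 
detection, loosely following the approach described in Appendix F of the 
XML 1.0 Recommendation. The function name sniff_xml_encoding, the 
400-byte window and the simplified regular expression are my own 
assumptions for illustration, not taken from any particular parser, and 
EBCDIC is left out.

```python
import codecs
import re

# Byte order marks, longest first so UTF-32 LE is not mistaken for UTF-16 LE.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

# How the first bytes of "<?xml" look when there is no BOM.
_NO_BOM = [
    (b"\x00\x00\x00<", "utf-32-be"),
    (b"<\x00\x00\x00", "utf-32-le"),
    (b"\x00<\x00?", "utf-16-be"),
    (b"<\x00?\x00", "utf-16-le"),
    (b"<?xm", "ascii"),  # some 8-bit, ASCII-compatible encoding
]

# Simplified pattern for the encoding pseudo-attribute of the declaration.
_DECL = re.compile(
    r"""^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def sniff_xml_encoding(data: bytes) -> str:
    """Guess the encoding of an XML document from its first bytes."""
    family, skip = None, 0

    # Step 1: the BOM or the byte pattern of "<?xml" fixes the code unit width.
    for bom, name in _BOMS:
        if data.startswith(bom):
            family, skip = name, len(bom)
            break
    if family is None:
        for prefix, name in _NO_BOM:
            if data.startswith(prefix):
                family = name
                break
    if family is None:
        return "utf-8"  # no BOM and no declaration: the XML default

    # Step 2: for 16- and 32-bit families the byte pattern already decides.
    if family != "ascii":
        return family

    # For 8-bit families, read the declaration as ASCII and trust the label.
    head = data[skip:skip + 400].decode("ascii", errors="replace")
    match = _DECL.match(head)
    return match.group(1).lower() if match else "utf-8"
```

For example, sniff_xml_encoding(b'<?xml version="1.0" 
encoding="ISO-8859-1"?><a/>') gives "iso-8859-1", while the same call on 
a document with no BOM and no declaration falls back to "utf-8". The 
sketch does not check whether the declared name is an encoding the 
platform supports, nor whether the rest of the document actually 
decodes with it.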

> ... Should the XML parser use a brute force approach and try all of these?

The only problem would come if the actual encoding does not match the 
declared encoding (I am leaving aside those cases where the processor 
knows the encoding by some other means).  The processor is not expected 
to sort out such discrepancies.

XML is one of the few formats out there that can handle multiple 
encodings and Unicode decently, and much of this is due to the XML 
declaration.

Cheers,

Tom P

-- 
Thomas B. Passin
Explorer's Guide to the Semantic Web (Manning Books)
http://www.manning.com/catalog/view.php?book=passin
