Re: encoding problem fixed
David Brownell wrote:

> Actually, that's not correct either.  My general advice is to pass a
> URI to the parser -- which is required to do the correct thing! -- and
> in those rare cases that can't be done:
>
>   * If the data is externally typed according to character set,
>     you MUST use some Reader ... e.g. given a MIME type of
>     "application/xml;charset=Big5", then use a reader set
>     up to use the "Big5" encoding (a Chinese encoding).  There
>     isn't much choice of classes; InputStreamReader, or a custom
>     reader that understands that encoding.
>
>   * If the data is NOT externally typed, then you MUST rely on
>     the XML parser's autodetection ... pass an InputStream.

This is all quite sound, and I was wrong to overlook the case of
external charset information.

> > Actually, it's doing what it's expected to: reading the native charset,
> > CP-1252.  (Unix JVMs use 8859-1 instead.)
>
> Those are actually system-specific defaults ... many localized versions
> of those environments work differently.  For example UNIX JVMs may well
> use the "EUC-JP" coding in Japan, or MS-Windows the "Shift_JIS".

Reasonable.

> In fact, my own basic guidance is never to pass any sort of I/O stream
> (InputStream -or- Reader!) to the parser; let the parser work from the
> URI, if at all possible.  It's normally quite possible, and it's a lot
> less likely to handle the encodings wrong than application code!!

This leads to an interesting question: what do various XML parsers
do when fetching http: URIs that produce explicit charset declarations?
Someone should try Aelfred, etc. and see if the header-level charset
declaration is respected, overriding the internal encoding declaration.

-- 
John Cowan   http://www.ccil.org/~cowan   cowan@c...
Schlingt dreifach einen Kreis um dies! / Schliesst euer Aug vor heiliger Schau,
Denn er genoss vom Honig-Tau / Und trank die Milch vom Paradies.
        -- Coleridge / Politzer

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
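For anyone following the thread, the three cases in the quoted advice (URI, externally typed stream, untyped stream) might look roughly like this in Java with SAX's InputSource. This is only a sketch: the class EncodingSketch and its helpers charsetOf, forUri, forStream and rootText are hypothetical names invented for illustration, the parser is whatever JAXP implementation the JVM supplies, and ISO-8859-1 stands in for Big5 in the demo because Big5 is not guaranteed to be installed on every JVM.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical illustration class; none of these names come from the thread.
final class EncodingSketch {

    // Extracts the charset parameter from a MIME type such as
    // "application/xml;charset=Big5"; returns null when there is none.
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String value = p.substring(8).trim();
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    // Preferred case: hand the parser a URI and let it do the fetching.
    static InputSource forUri(String systemId) {
        return new InputSource(systemId);
    }

    // Fallback when only a stream is available.
    static InputSource forStream(InputStream in, String contentType) {
        String charset = charsetOf(contentType);
        if (charset != null) {
            // Externally typed: decode with a Reader set up for that encoding.
            // Charset.forName throws an unchecked exception for unknown names.
            Reader reader = new InputStreamReader(in, Charset.forName(charset));
            return new InputSource(reader);
        }
        // Not externally typed: pass raw bytes and rely on autodetection.
        return new InputSource(in);
    }

    // Parses the source and returns the text of the root element.
    static String rootText(InputSource source) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(source);
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // No encoding declaration inside; the charset comes from the MIME type.
        byte[] typed = "<greeting>caf\u00e9</greeting>"
                .getBytes(StandardCharsets.ISO_8859_1);
        String viaReader = rootText(forStream(
                new ByteArrayInputStream(typed), "application/xml;charset=ISO-8859-1"));

        // No external charset; the parser reads the encoding declaration itself.
        byte[] declared = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><greeting>caf\u00e9</greeting>"
                .getBytes(StandardCharsets.ISO_8859_1);
        String viaStream = rootText(forStream(
                new ByteArrayInputStream(declared), "application/xml"));

        System.out.println("caf\u00e9".equals(viaReader));
        System.out.println("caf\u00e9".equals(viaStream));
    }
}
```

Note that the sketch never uses an InputStreamReader without an explicit charset, since that would pick up the platform default (CP-1252, 8859-1, EUC-JP, Shift_JIS and so on) that the thread warns about. It does not answer the open question about http: URIs; what a given parser does with a header-level charset when handed a URI still has to be tested parser by parser.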