expat and encodings
Can someone clarify the issue of character encodings for me? I think this is an expat issue, but it may be a more general thing.

I'm trying to save and load text that might contain accented characters (code points above 127), running on Windows 95. I realise that when writing XML I either have to convert such characters to "&#xxx;" form or declare the file's encoding as "iso-8859-1"; otherwise the XML parser (expat) objects when subsequently reading the file.

The snag is that whether the file is encoded as UTF-8 or ISO-8859-1, the text the application receives from the parser always seems to be UTF-8. I've tried specifying "iso-8859-1" as the encoding in the XML_ParserCreate() call, but this seems to have no effect. (I guess the parameter actually overrides the default (UTF-8) file encoding, rather than specifying the encoding the client would like to see.)

The questions:

1. Is my understanding correct: does expat feed UTF-8 text to clients when parsing?
2. Can expat be asked to feed clients ISO-8859-1?
3. If the client must convert manually, are there any helper functions in expat/xmltok?
4. If I use the Unicode build of expat, does it feed UTF-8, Unicode or UTF-16?

Many thanks,
Steve Kearon
FineLine Software

xml-dev: A list for W3C XML Developers. Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
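For the manual-conversion case raised in the questions, here is a minimal sketch in C of what a client-side UTF-8 to ISO-8859-1 conversion might look like. The function name utf8_to_latin1 is hypothetical and is not part of expat or xmltok; the sketch assumes the input is the UTF-8 text a handler receives, replaces anything above U+00FF with '?', and does not reject overlong encodings.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of expat): convert a UTF-8 buffer, such as
   the text handed to a character data handler, into ISO-8859-1.
   Code points above U+00FF have no Latin-1 equivalent and become '?'.
   Returns the number of bytes written, or (size_t)-1 if the input is
   malformed or the output buffer is too small. */
size_t utf8_to_latin1(const char *in, size_t in_len, char *out, size_t out_cap)
{
    size_t i = 0, n = 0;

    assert(in != NULL && out != NULL);
    while (i < in_len) {
        unsigned char c = (unsigned char)in[i];
        unsigned long cp;
        size_t extra, k;

        /* Decode the lead byte: payload bits and number of trailing bytes. */
        if (c < 0x80)                { cp = c;        extra = 0; }
        else if ((c & 0xE0) == 0xC0) { cp = c & 0x1F; extra = 1; }
        else if ((c & 0xF0) == 0xE0) { cp = c & 0x0F; extra = 2; }
        else if ((c & 0xF8) == 0xF0) { cp = c & 0x07; extra = 3; }
        else return (size_t)-1;      /* stray continuation or invalid lead byte */

        if (extra > in_len - i - 1)
            return (size_t)-1;       /* sequence truncated at end of buffer */

        /* Fold in six payload bits from each continuation byte. */
        for (k = 1; k <= extra; k++) {
            unsigned char cc = (unsigned char)in[i + k];
            if ((cc & 0xC0) != 0x80)
                return (size_t)-1;
            cp = (cp << 6) | (cc & 0x3F);
        }
        i += extra + 1;

        if (n >= out_cap)
            return (size_t)-1;       /* output buffer too small */
        out[n++] = (cp <= 0xFF) ? (char)(unsigned char)cp : '?';
    }
    return n;
}
```

Since each Latin-1 character occupies one byte and each UTF-8 character at least one, an output buffer as large as the input is always sufficient. The going-the-other-way alternative mentioned above (writing "&#xxx;" references) avoids the conversion on output but not on input.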