Re: xml over http - RFC 3023
Hi Rick,

> The out-of-band signalling of character encoding is a fundamentally broken
> idea, because there are no mechanisms for programs which generate data to
> memoize the character encoding used that can then feed the rest of the
> food-chain.

How about the BOM? That's one way, isn't it? I wonder if a similar ignorable byte sequence could be added to the start of all byte sequences to indicate the encoding of what's coming.

>> At the moment it all seems pretty complicated...

> It is not complicated. Use application/xml
>
> If you do find intermediate web systems that implement the ASCII default or
> the IS8859-1 default as anything other than 8-bit clean for text/xml submit
> a bug report.

I'm dealing with RSS feeds from all over the world, so it's:

- 3 different types of feeds
- multiple languages, multiple encodings
- embedded, inconsistently escaped HTML, or CDATA sections, or both
- and even use of entities without including the doctype, so the feed doesn't parse without help

It is possible to reject some of the feeds, but other readers accept them, so this one needs to at least match them before taking the moral high ground (and it's not too hard to code around the problems). So this is a real test of XML on the web.

The complicated part I was referring to is reading the bytes from the HTTP input stream in the right encoding:

- extract the encoding from the Content-Type header
- if it's not there, read the first few bytes of the stream as US-ASCII and extract the encoding from the prolog
- if it's not there, use UTF-8
- hope that the actual encoding of the file and the encoding you've discovered match

...and that's not even completely correct, as far as I understand. So when you say "It is not complicated. Use application/xml", I don't get it. What am I missing?
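As a rough illustration of the four steps listed above, here is a minimal Python sketch. The function names `sniff_encoding` and `decode_feed` and both regular expressions are hypothetical, written for this example only. It assumes the whole response body is already in memory as bytes, and it deliberately ignores BOMs and encodings that are not ASCII-compatible (such as UTF-16), so it is no more "completely correct" than the list it follows.

```python
import re

# Hypothetical helper patterns for this sketch only.
# Matches charset=... in a Content-Type header value, quoted or not.
_CHARSET_RE = re.compile(r'charset\s*=\s*"?([^\s";]+)', re.IGNORECASE)
# Matches encoding="..." inside an XML declaration at the very start of the body.
_DECL_RE = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_encoding(content_type, body):
    """Guess the encoding name using the three lookup steps from the list."""
    # Step 1: charset parameter of the Content-Type header, if present.
    if content_type:
        match = _CHARSET_RE.search(content_type)
        if match:
            return match.group(1).lower()
    # Step 2: look at the first bytes as ASCII and read the XML declaration.
    match = _DECL_RE.match(body[:200])
    if match:
        return match.group(1).decode("ascii").lower()
    # Step 3: nothing declared anywhere, so assume UTF-8.
    return "utf-8"


def decode_feed(content_type, body):
    """Decode the body, falling back to lossy UTF-8 if the guess is wrong."""
    encoding = sniff_encoding(content_type, body)
    try:
        # Step 4: hope the declared encoding matches the actual bytes.
        return body.decode(encoding)
    except (LookupError, UnicodeDecodeError):
        # Unknown encoding name or mismatching bytes: keep going rather
        # than rejecting the feed, replacing undecodable bytes.
        return body.decode("utf-8", errors="replace")
```

For example, `sniff_encoding("application/xml", body)` falls through to the XML declaration because the header carries no charset. The lossy fallback in `decode_feed` is a design choice that suits a reader that must accept feeds other readers accept; a stricter client could raise instead.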
I would've thought the web server would be aware that it was serving XML and take care of it: it could extract the encoding from the XML prolog and ensure the file was served with that (maintaining it however it liked). It seems odd that the client should go through this process every time.

thanks

--
Andrew Welch
http://andrewjwelch.com
Kernow: http://kernowforsaxon.sf.net/