Re: RE: There is a serious amount of character encodingconvers
Roger,

Here is a classic post from XML.com that is right in line with the topic of character encodings that you have been posting about recently, titled "XML on the web has failed":

http://www.xml.com/pub/a/2004/07/21/dive.html

It takes some work to really grok the problems the author is describing, but it is well worth it, I think, and may make your head spin (or hurt, depending).

I'd be very interested to hear if any of the XML / character encoding gurus on this list have any comments on, or links to updates to, this article (which was written in 2004). I am not sure whether the issues the author describes have been remedied.

Chris

On Fri, Dec 28, 2012 at 12:17 PM, David Lee <dlee@calldei.com> wrote:
> ---------
>
> You are writing about character encoding conversions as text moves from
> point to point to point.
>
> Is there a parallel with markup? Are there markup conversions as XML moves
> from point to point to point?
>
> Are there lessons learned in the character encoding community that could be
> applied to the XML community?
>
> --------
>
> Markup is text and has the same problems (and solutions).
>
> If we could start over from scratch with what we know now there would be
> fewer problems.
>
> IMHO, my preferred solution is to stick to a single encoding everywhere (I
> vote for UTF-8 ... as it handles all Unicode code points).
>
> The next step is to make sure *every single link in the chain* uses that
> encoding.
>
> This is amazingly difficult even in "modern" languages like Java, where the
> default behavior when converting bytes to strings is to use
> the *system default encoding*, which is always an unknown. Even in pure
> Java you have to track every single point where a byte array is converted to
> a String and vice versa,
> and explicitly set the encoding (or guarantee the system encoding is
> correct).
>
> Then you have to manage all places the data enters and leaves the program
> and make sure it's in the right encoding.
>
> Then you have to make sure all places that *store* the data (like a
> database) don't muck with it.
>
> XML itself cannot solve this problem alone, as an XML document is only the
> payload ... However, the XML tools tend to be a bit more mature about
> dealing with this.
>
> But not always.
>
> Maybe in another 30 years we will have migrated all our tools to be
> consistent about encodings.
>
> ----------------------------------------
> David A. Lee
> dlee@calldei.com
> http://www.xmlsh.org
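
The point above about tracking every byte-array-to-String conversion in Java can be illustrated with a small sketch. It is not code from the thread: the class name `ExplicitCharsetDemo` and its helper methods are hypothetical, chosen only for illustration. It uses the standard `String.getBytes(Charset)` and `new String(byte[], Charset)` overloads so that no step falls back on the platform default, and it simulates a single link in the chain that decodes with the wrong charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative assumptions.
final class ExplicitCharsetDemo {

    // Encode with an explicitly named charset rather than the platform default.
    static byte[] encodeUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decode with the same explicitly named charset.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Simulates one link in the chain that assumes a different charset.
    static String decodeAs(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "caf\u00e9" is "cafe" with an acute accent on the e; the escape keeps
        // this source file independent of the compiler's own source encoding.
        String original = "caf\u00e9";

        byte[] wire = encodeUtf8(original);
        System.out.println("UTF-8 byte count: " + wire.length);

        // Same charset on both ends: the text survives the round trip.
        System.out.println("round trip ok: " + decodeUtf8(wire).equals(original));

        // Mismatched charset on one end: the two UTF-8 bytes of the accented
        // character are read as two separate ISO-8859-1 characters.
        String garbled = decodeAs(wire, StandardCharsets.ISO_8859_1);
        System.out.println("mismatch ok: " + garbled.equals(original));

        // What the no-argument conversions would use on this particular JVM.
        System.out.println("platform default: " + Charset.defaultCharset());
    }
}
```

Here the four-character string encodes to five UTF-8 bytes, the matched round trip compares equal, and the mismatched decode does not. The last line varies from machine to machine, which is the reason for naming the charset at every conversion instead of relying on it.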