Re: RE: There is a serious amount of character encodingconvers

  • From: Chris Maloney <voldrani@gmail.com>
  • To: David Lee <dlee@calldei.com>
  • Date: Fri, 28 Dec 2012 14:45:28 -0500

Roger,

Here is a classic post from XML.com that is right in line with the
topic of character encodings that you have been posting about
recently, titled "XML on the web has failed":
http://www.xml.com/pub/a/2004/07/21/dive.html

It takes some work to really grok the problems the author is
describing, but it is well worth it, I think, and may make your head
spin (or hurt, depending).

I'd be very interested to hear if any of the XML / character encoding
gurus on this list have any comments or links to updates to this
article (which was written in 2004).  I am not sure if the issues the
author describes have been remedied or not.

Chris


On Fri, Dec 28, 2012 at 12:17 PM, David Lee <dlee@calldei.com> wrote:
> ---------
>
> You are writing about character encoding conversions as text moves from
> point to point to point.
>
> Is there a parallel with markup? Are there markup conversions as XML moves
> from point to point to point?
>
> Are there lessons learned in the character encoding community that could be
> applied to the XML community?
>
> --------
>
> Markup is text and has the same problems (and solutions).
>
> If we could start over from scratch with what we know now, there would be
> fewer problems.
>
> IMHO, my preferred solution is to stick to a single encoding everywhere (I
> vote for UTF-8, as it handles all Unicode code points).
>
> The next step is to make sure *every single link in the chain* uses that
> encoding.
>
> This is amazingly difficult even in "modern" languages like Java, where the
> default behavior when converting bytes to strings is to use the *system
> default encoding*, which is always an unknown. Even in pure Java you have
> to track every single point where a byte array is converted to a String
> and vice versa, and explicitly set the encoding (or guarantee the system
> encoding is correct).
>
> Then you have to manage all the places where data enters and leaves the
> program and make sure it's in the right encoding.
>
> Then you have to make sure all the places that *store* the data (like a
> database) don't muck with it.
>
> XML itself cannot solve this problem alone, as an XML document is only the
> payload. However, the XML tools tend to be a bit more mature about
> dealing with this.
>
> But not always.
>
> Maybe in another 30 years we will have migrated all our tools to be
> consistent about encodings.
>
> ----------------------------------------
> David A. Lee
> dlee@calldei.com
> http://www.xmlsh.org
>
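The point in the quoted message about Java's byte/String conversions can be sketched roughly as follows. This is only an illustration, not code from either poster: the class name `ExplicitEncoding` and the helpers `encode` and `decode` are made up for the example. It assumes Java 7 or later, where `java.nio.charset.StandardCharsets` is available. The idea is to name the charset at every conversion instead of relying on the platform default.

```java
import java.nio.charset.StandardCharsets;

public class ExplicitEncoding {

    // Hypothetical helper: always encode with an explicit charset.
    // text.getBytes() with no argument would use the platform default instead.
    public static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Hypothetical helper: always decode with the same explicit charset.
    // new String(bytes) with no charset would use the platform default instead.
    public static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "café"; the last character is U+00E9

        byte[] utf8 = encode(original);
        // U+00E9 takes two bytes in UTF-8, so four characters become five bytes.
        System.out.println("UTF-8 byte count: " + utf8.length);

        // Decoding with the charset used for encoding restores the text.
        System.out.println("Round trip intact: " + decode(utf8).equals(original));

        // Decoding the same bytes with a different charset silently corrupts
        // the text; no exception is thrown.
        String wrong = new String(utf8, StandardCharsets.ISO_8859_1);
        System.out.println("Wrong charset intact: " + wrong.equals(original));
    }
}
```

The same discipline would apply to readers, writers, and database drivers: anywhere bytes become characters or the reverse, the charset has to be stated or otherwise guaranteed.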

