Re: RE: There is a serious amount of character encodingconvers
Argh. Let's try that again:

> I'd be very interested to hear if any of the XML / character
> encoding gurus on this list have any comments
> or links to updates to this article (which was written in 2004).
>  I am not sure if the issues the author describes have
> been remedied or not.

In 2004, UTF-8 was a noise encoding on the Web: see <http://googleblog.blogspot.com/2012/02/unicode-over-60-percent-of-web.html>. As of the beginning of 2012, it was more than 60% of the documents visible to Google. If you count pure ASCII documents as UTF-8, which you can do, it's up at 80%.

If the trend line continues, which is of course not something you can count on, I'd expect to see UTF-8 rise by another 5% or so, though perhaps pure ASCII will drop by about half that amount, leaving the total situation nearly unchanged.

In short: More than 80% of the Web is now UTF-8 one way or another, and less than 10% is Latin-1 and related encodings, leaving just about 10% for all the rest. (UTF-16 is less than 0.1%, according to Mark Davis.) Not exactly a ringing endorsement for "publish in any encoding you want" (per the article), is it?
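The point about counting pure ASCII documents as UTF-8 can be checked directly: every byte sequence that is valid ASCII decodes to the same text under UTF-8, while a Latin-1 document containing non-ASCII characters generally does not decode as UTF-8. The following is only a minimal sketch in Python; the helper name `is_valid_utf8` and the sample documents are made up for illustration and are not from the article or the Google post.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample documents, for illustration only.
ascii_doc = "<?xml version='1.0'?><greeting>Hello</greeting>".encode("ascii")
latin1_doc = "<name>caf\u00e9</name>".encode("latin-1")  # e-acute is byte 0xE9

# A pure ASCII document is byte-for-byte a valid UTF-8 document,
# and both decoders produce the same text.
same_text = ascii_doc.decode("utf-8") == ascii_doc.decode("ascii")

# The Latin-1 document is not valid UTF-8: 0xE9 announces a multi-byte
# sequence, but the next byte ('<') is not a continuation byte.
latin1_is_utf8 = is_valid_utf8(latin1_doc)
```

This is why ASCII pages can fairly be added to the UTF-8 total, whereas Latin-1 pages have to be counted separately.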
On Fri, Dec 28, 2012 at 2:45 PM, Chris Maloney <voldrani@gmail.com> wrote:
> Roger,

GMail doesn't have rotating .sigs, but you can see mine at http://www.ccil.org/~cowan/signatures