
UTF-8 or ? for SML (was: Re: Feeler for SML (Simple Markup Language))

  • From: Tony Graham <tgraham@m...>
  • To: <xml-dev@i...>
  • Date: Sat, 13 Nov 1999 14:11:25 -0400 (EST)

At 13 Nov 1999 15:46 -0000, Richard Anderson wrote:
 > But UTF-8 can support "foreign" characters so I don't see the argument for
 > having UTF-16 too.  Also, generally speaking UTF-8 encoding results in
 > smaller output for most cases.

Different people have different ideas of what constitutes "foreign".

For the majority of the characters in the Unicode Standard, UTF-8 uses
three bytes per character.  However, for the US-ASCII characters, it
uses only one byte per character.

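For characters in the Basic Multilingual Plane, which covers nearly all
characters in everyday use, UTF-16 uses two bytes per character.
Characters beyond it are encoded as surrogate pairs and take four bytes.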

Whether a given file takes fewer bytes as UTF-8 or as UTF-16 is largely
a function of the proportion of unaccented Latin characters in the file.
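
As a rough illustration of that trade-off, here is a minimal Python
sketch that counts the encoded bytes of a few strings. The helper name
encoded_sizes and the sample strings are made up for this example, and
UTF-16 is encoded without a byte order mark so that only the character
data is counted.

```python
def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, with no byte order mark."""
    # "utf-16-le" is used instead of "utf-16" because the latter
    # prepends a 2-byte byte order mark to the output.
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


# Hypothetical sample strings, chosen only to vary the share of ASCII.
samples = {
    "ascii only": "Simple Markup Language",
    "mixed": "SML: \u5358\u7d14\u306a\u8a00\u8a9e",
    "japanese only": "\u5358\u7d14\u306a\u8a00\u8a9e",
}

for name, text in samples.items():
    utf8, utf16 = encoded_sizes(text)
    print(f"{name}: UTF-8 {utf8} bytes, UTF-16 {utf16} bytes")
```

The more of the text that falls in the US-ASCII range, the more the
comparison tips toward UTF-8; text made up of three-byte characters
tips it toward UTF-16.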

Moreover, most legacy encodings for a single script use one byte per
character, although Chinese, Japanese, and Korean encodings use two or
more bytes per character.  UTF-8, therefore, isn't as efficient as the
legacy encodings of most scripts.  (Its advantage is that it can
represent more scripts than any legacy encoding.)
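
To make the comparison with legacy encodings concrete, the following
Python sketch measures the same text in a legacy encoding and in UTF-8.
ISO 8859-7 (Greek) and Shift_JIS (Japanese) are picked here only as
examples of legacy encodings; they are not named in the discussion
above, and the helper size_in is hypothetical.

```python
def size_in(text, encoding):
    """Return the number of bytes text occupies in the given encoding."""
    return len(text.encode(encoding))


# Assumed example pairs: (label, sample text, legacy codec name).
cases = [
    ("Greek", "\u03b1\u03b2\u03b3", "iso8859_7"),
    ("Japanese", "\u8a00\u8a9e", "shift_jis"),
]

for label, text, legacy in cases:
    print(f"{label}: {legacy} {size_in(text, legacy)} bytes, "
          f"UTF-8 {size_in(text, 'utf-8')} bytes")

# The catch with a legacy encoding: it cannot represent other scripts.
try:
    "\u8a9e".encode("iso8859_7")
except UnicodeEncodeError:
    print("ISO 8859-7 cannot encode a Japanese character")
```

In both cases the legacy encoding is smaller for its own script, but
only UTF-8 (or UTF-16) can hold both samples in a single file.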

Regards,


Tony Graham
======================================================================
Tony Graham                            mailto:tgraham@m...
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9632
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@i... the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
