
Re: Correct xml:lang value for Pinyin Chinese vs Simplified Ch

  • From: Rick Jelliffe <rjelliffe@allette.com.au>
  • To: Lech Rzedzicki <xchaotic@gmail.com>
  • Date: Tue, 28 Feb 2012 03:25:34 +1100

Are you sure you have the right terms here?  Pinyin is not pidgin. And
it usually has no accents. (If it has accents, in particular macrons,
it may not be standard Pinyin, which is not to say that it might not
be an old or extended Pinyin.)

Language codes are in flux: the three letter codes and the two letter
codes have different approaches. The two letter codes plus regional
variant may still be safest.  So first you need to determine the
region: is your simplified text from PRC or Singapore?

Assuming it is from the PRC, the language code zh-CN should be enough AFAIK.

Note that there is (or should be) no need to specify anything about
the script if you are just marking up existing text. @xml:lang
specifies the language, and the script only indirectly because a
language+region often has a standard or characteristic orthography:
the general script being used is obvious from the characters
themselves.
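
To illustrate the point that the script can be read off the characters,
here is a minimal Python sketch. It assumes that Unicode character names
are a good enough signal; the function name guess_script and its labels
are made up for this example and are not part of any standard.

```python
import unicodedata


def guess_script(text):
    """Return a rough script label based on Unicode character names.

    Hypothetical helper: it only distinguishes Han, Latin and Bopomofo,
    and ignores digits, punctuation and whitespace.
    """
    counts = {"Han": 0, "Latin": 0, "Bopomofo": 0}
    for ch in text:
        # Characters without a name (e.g. control characters) yield "".
        name = unicodedata.name(ch, "")
        if name.startswith("CJK UNIFIED IDEOGRAPH"):
            counts["Han"] += 1
        elif name.startswith("LATIN"):
            counts["Latin"] += 1
        elif name.startswith("BOPOMOFO"):
            counts["Bopomofo"] += 1
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "Unknown"


if __name__ == "__main__":
    for sample in ("中文", "Zhōngwén", "ㄓㄨㄥ"):
        print(sample, guess_script(sample))
```

A consumer that can do this has no real need for a script subtag on
text that is already written down; the subtag is only an extra hint.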

So you could use xml:lang="zh-CN" for all three cases you
mention. If you wanted to give more of a hint, you could try
xml:lang="zh-CN-pinyin" or  "zh-Latn-CN-pinyin"  for the standard
pinyin,  and  xml:lang="zh-CN-pinyin-adhoc" or "zh-Latn-CN-adhoc" for
the non-standard one (where "adhoc" is some phrase you pick to
indicate an extended pinyin or mystery format.)
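
To show how those tags break down into language, script, region and
variant parts, here is a rough Python sketch. It assumes the subtags
can be told apart by length and shape alone (4 letters for a script,
2 letters or 3 digits for a region). It does not consult the IANA
registry and does not handle extensions or private-use subtags, so it
says nothing about whether a subtag such as "adhoc" is registered. The
name split_tag is made up for this example.

```python
def split_tag(tag):
    """Split a language tag into language, script, region and variants.

    Hypothetical helper using shape heuristics only; it does not check
    any subtag against the IANA registry.
    """
    parts = tag.split("-")
    result = {
        "language": parts[0].lower(),
        "script": None,
        "region": None,
        "variants": [],
    }
    for sub in parts[1:]:
        no_later_parts = result["region"] is None and not result["variants"]
        is_script = len(sub) == 4 and sub.isalpha()
        is_region = (len(sub) == 2 and sub.isalpha()) or (
            len(sub) == 3 and sub.isdigit()
        )
        if result["script"] is None and no_later_parts and is_script:
            result["script"] = sub.title()
        elif no_later_parts and is_region:
            result["region"] = sub.upper()
        else:
            # Anything after the region is treated as a variant.
            result["variants"].append(sub.lower())
    return result


if __name__ == "__main__":
    for tag in ("zh-CN", "zh-Latn-CN-pinyin", "zh-CN-pinyin-adhoc"):
        print(tag, split_tag(tag))
```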

(I suspect the transliterated Chinese with accented Roman characters
would not be a legitimate zh-Latn-CN (I'd expect John Cowan to be on
top of this) but if it were, then that would probably be the best for
the non-standard transliteration.)

If you want to mark up your text so that screen readers can read it,
then find the website for the screen reader, contact the developers,
and ask them. I doubt if the non-standard pinyin would have specialist
readers that can understand it in any case (though IIRC there was a
reader that understood 1,2,3,... tone digits in with pinyin or
bopomofo.)

For more info, see
 http://www.alvestrand.no/pipermail/ietf-languages/2008-September/008322.html
You could also track down the current IANA registrations for
http://www.ietf.org/rfc/rfc4646.txt (since obsoleted by RFC 5646), I guess.

Cheers
Rick Jelliffe

