
Re: parsing and translating xml:lang attribute

Subject: Re: parsing and translating xml:lang attribute
From: Mike Brown <mike@xxxxxxxx>
Date: Tue, 24 Oct 2000 15:48:31 -0600 (MDT)
Matthias O. Will wrote:
> > <Language xml:lang="ge"/>

"de", right? :)

> but Xalan complains while parsing and produces the following error message:
> 
> > Parser error: Attribute "xml:lang" is required and must be specified for
> > element type "Language"

Is Xerces your parser? There doesn't seem to be anything wrong ... are you
sure it's complaining about that specific instance of <Language>? (check
the line number)

> The second issue is that the values for this attribute are conforming to
> the two-digit language abbreviations according to ISO 639, but my target
> DTD uses three-digit language strings according to ISO 639-2 (e. g. 'de'
> would be translated into 'ger'). I do have a list of both, but I wonder how
> to technically best achieve the mapping using XSL.

xml:lang values must be RFC 1766 'language tags' ('tag' being a most
unfortunate choice of word in an XML context... I prefer 'identifier').
RFC 1766 mandates, essentially, that if the identifier is just 2
characters, or if the 3rd character is '-', then the first 2 characters
must be an ISO 639:1988 2-letter language code. The RFC's author recently
clarified that the intent was to refer to ISO 639:1988 and its successors,
so you should be using the most up-to-date list of 2-letter language
codes from ISO 639-1. RFC 1766 makes no provision for the 3-letter
ISO 639-2 codes at all. It was a little short-sighted in this regard and is
being revised to address this issue (and the fact that the ISO 639-2 list
is far more complete!)

...so if you are intending to put 3-letter codes in an xml:lang value in
the target document, then you're wrong to do so :)

Anyway, to answer your question:

<?xml version="1.0" encoding="utf-8"?>
<!-- langCodeMap.xml -->
<langCodeMap>
  <langCode iso639-1="de" iso639-2="ger"/>
  <langCode iso639-1="en" iso639-2="eng"/>
  ...
</langCodeMap>

and in the XSLT...

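<!-- Load the lookup table; the relative URI is resolved against the stylesheet. -->
<xsl:variable name="langCodes" select="document('langCodeMap.xml')/langCodeMap/langCode"/>
<!-- Assumes the context node is the parent of the <Language> element. -->
<xsl:variable name="langIn" select="Language/@xml:lang"/>
<!-- Emit the 3-letter code whose 2-letter counterpart matches the input value. -->
<LanguageOut xml:lang="{$langCodes[@iso639-1 = $langIn]/@iso639-2}"/>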

There are, of course, various ways to do it; this is just one.
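
To make the lookup a bit more robust, here is a minimal sketch of a complete XSLT 1.0 stylesheet built on the same langCodeMap.xml. It is only one possible arrangement, and a few things in it are my assumptions rather than anything from your DTD: the 'code' attribute on LanguageOut is hypothetical (chosen so the 3-letter code does not land in xml:lang), the match pattern assumes the source element is called Language, and the empty-string fallback is just a placeholder policy. It also strips any subtag first, so a value like "de-CH" still matches the "de" entry.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Lookup table; the relative URI is resolved against the stylesheet. -->
  <xsl:variable name="langCodes"
                select="document('langCodeMap.xml')/langCodeMap/langCode"/>

  <xsl:template match="Language">
    <!-- Keep only the part before the first '-': "de-CH" and "de" both give "de". -->
    <xsl:variable name="primary"
                  select="substring-before(concat(@xml:lang, '-'), '-')"/>
    <!-- Node-set holding the matching iso639-2 attribute, or empty if none. -->
    <xsl:variable name="code3"
                  select="$langCodes[@iso639-1 = $primary]/@iso639-2"/>
    <xsl:choose>
      <xsl:when test="$code3">
        <!-- 'code' is a hypothetical attribute name; use whatever the target DTD defines. -->
        <LanguageOut code="{$code3}"/>
      </xsl:when>
      <xsl:otherwise>
        <!-- No entry in the map: report it and emit an empty code (assumed policy). -->
        <xsl:message>No ISO 639-2 code for '<xsl:value-of select="@xml:lang"/>'</xsl:message>
        <LanguageOut code=""/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

With langCodeMap.xml sitting next to the stylesheet, an input of <Language xml:lang="de-CH"/> should come out as <LanguageOut code="ger"/>. The comparison is case-sensitive as written, so "DE" would fall through to the otherwise branch unless you lowercase the value with translate() first.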

I question the use of xml:lang on an element called 'Language' though.
xml:lang identifies a language that the element content is in; it isn't
supposed to be a substitute for the content itself.

For example,

<Language xml:lang="en">German</Language>
<Language xml:lang="de">Deutsch</Language>


   - Mike
____________________________________________________________________
Mike J. Brown, software engineer at         My XML/XSL resources:
webb.net in Denver, Colorado, USA           http://www.skew.org/xml/


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

