
Re: Use the xml:lang attribute to set the collation?

Subject: Re: Use the xml:lang attribute to set the collation?
From: Michael Kay <mike@xxxxxxxxxxxx>
Date: Mon, 07 Jan 2013 15:44:01 +0000
Roger, have you read up on this subject? It's very thoroughly covered in my book (XSLT 2.0 and XPath 2.0 Programmer's Reference 4th edition): see "collations" in the index, and especially pages 459 et seq on xsl:sort. And of course in many other places. I don't think the readers of this list necessarily want to follow every small step in your learning curve.

The choice of collation is made in the stylesheet (or other program); it is NOT a property of the data. There are various reasons for that decision, the main one being that when you publish a phone book, it's the users of the phone book whose requirements you are concerned with, not the nationality of the people whose names are listed in the book. So xml:lang in the data makes no difference. But a lang attribute on xsl:sort does make a difference.
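As a minimal sketch of the lang attribute on xsl:sort: the input structure (a names element containing name children) and the choice of Swedish are assumptions made up for illustration, not taken from the original question.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical input: <names><name>...</name><name>...</name></names> -->
  <xsl:template match="/names">
    <xsl:for-each select="name">
      <!-- The lang attribute asks the processor for a collation suited to
           Swedish readers. Any xml:lang on the name elements plays no part. -->
      <xsl:sort select="." lang="sv"/>
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Changing lang="sv" to another language code changes the requested ordering without touching the data. How closely the result follows that language's conventions depends on the processor.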

To take a simple example where the choice of collation makes a difference,

<xsl:value-of select="'a' eq 'A'" default-collation="http://saxon.sf.net/collation?ignore-case=yes"/>

will give different results from

<xsl:value-of select="'a' eq 'A'" default-collation="http://saxon.sf.net/collation?ignore-case=no"/>

Choosing a collation based on language alone will not usually affect the result of the '=' operator, only '<' and '>', because the language-based rules are mainly designed to influence sort behaviour, and for good sorting behaviour you usually want to treat all strings as distinct.
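To illustrate the point about '<' and '>', here is a minimal sketch that compares the same two strings under two language-based collations. It assumes Saxon and the lang parameter of its http://saxon.sf.net/collation URI. The languages (Swedish and German), the test strings, and the template name "main" are illustrative choices, not from the original exchange.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Named entry point, so no source document is needed -->
  <xsl:template name="main">
    <!-- Swedish convention: ö is a separate letter placed after z -->
    <xsl:value-of select="compare('z', 'ö', 'http://saxon.sf.net/collation?lang=sv')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- German convention: ö sorts alongside o, so before z -->
    <xsl:value-of select="compare('z', 'ö', 'http://saxon.sf.net/collation?lang=de')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

With a processor that honours these collations, the two calls to compare() would be expected to return values of opposite sign, while 'z' eq 'ö' stays false under both.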

Michael Kay
Saxonica

On 07/01/2013 15:28, Costello, Roger L. wrote:
Hi Folks,

Michael Kay wrote this response to a StackOverflow question [1]:

     Saxon's default collation is Unicode codepoint, which is fast
     but not smart. Setting lang="en" will immediately give you a
     smarter natural-language collation. There are then many
     options to refine it further.

QUESTIONS
1. Does Michael's response mean that, to set the collation, I can use the xml:lang attribute instead of the default-collation attribute?

2. Would you please give an example of a comparison where the result of the comparison is true when xml:lang="A" but false when xml:lang="B"? That is, what values would you place in here:

     <Test xml:lang="__">
         <xsl:value-of select=" '__' lt '__' " />
     </Test>

/Roger

[1] http://stackoverflow.com/questions/13052896/xslt-sort-edge-case-for-ascending-sort-by-element-name
