Subject: Re: Sorting Upper-Case first. Microsoft bug?
From: David Carlisle <davidc@xxxxxxxxx>
Date: Fri, 8 Aug 2003 10:39:07 +0100
> Dr. Johnson and every lexicographer since has used case as the least
> significant, most rapidly varying element in ordering. The example I
> have in front of me from the Concise Oxford Dictionary lists daily -
> Dalmatian - dalmatic and I would not expect it to do anything else. 

Dictionaries are not really a good example to follow here, as they don't
have to deal with all strings: the dictionary probably doesn't list
DAILY or dalmatioN at all, but xsl:sort has to deal with such strings.
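
To make the point concrete, here is a minimal Python sketch of a dictionary-style ordering in which case is the least significant criterion, applied to the strings mentioned above. The function name `case_last_key` and the `upper_first` tie-break are assumptions made for illustration; nothing here is taken from the XSLT 1 spec or from any particular processor.

```python
def case_last_key(word, upper_first=True):
    """Sort key: compare case-folded letters first, case only as a tie-break."""
    folded = word.casefold()
    if upper_first:
        # Assumed tie-break: an uppercase letter sorts before its lowercase form.
        flags = tuple(0 if ch.isupper() else 1 for ch in word)
    else:
        # Alternative tie-break: lowercase before uppercase.
        flags = tuple(0 if ch.islower() else 1 for ch in word)
    return (folded, flags)


words = ["daily", "Dalmatian", "dalmatic", "DAILY", "dalmatioN"]

ordered = sorted(words, key=case_last_key)
# ['DAILY', 'daily', 'Dalmatian', 'dalmatic', 'dalmatioN']
```

The dictionary sequence daily - Dalmatian - dalmatic is preserved, but placing DAILY relative to daily needs a rule that a dictionary never has to state.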

> When Dennis Ritchie devised C before 1978, strcmp() would give a sort
> order that would place Dalmatian first (assuming ASCII) but in those
> days most of us were still using uppercase-only i/o devices and not
> worried about such refinements. If we were, we used strcmpi().

ASCII ordering would put all the uppercase before all the lowercase:
ordering A B C a b c.
No one has suggested xsl:sort is specified as doing that, despite
several people giving that as a reason for not implementing xsl:sort as
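
For reference, the ASCII ordering described above can be reproduced with a short Python sketch. Python compares strings by code point, which coincides with ASCII order for these characters; the word list is taken from the quoted dictionary example.

```python
# Plain code-point comparison: every uppercase letter precedes every
# lowercase letter, because 'A'..'Z' are 65..90 and 'a'..'z' are 97..122.
letters = ["a", "B", "c", "A", "b", "C"]
by_code_point = sorted(letters)
# ['A', 'B', 'C', 'a', 'b', 'c']

code_points = [(ch, ord(ch)) for ch in "ABCabc"]
# [('A', 65), ('B', 66), ('C', 67), ('a', 97), ('b', 98), ('c', 99)]

words = ["daily", "Dalmatian", "dalmatic"]
words_by_code_point = sorted(words)
# ['Dalmatian', 'daily', 'dalmatic']
```

This is the strcmp()-style result mentioned in the quoted text: Dalmatian comes first purely because "D" has a lower code than "d".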

> The world has moved on and the whole thrust of Unicode is to coerce the
> mechanical representation of text into natural linguistic usage, so
> Dr. Johnson wins. 

As I commented before, the discussion really isn't about the best way of
sorting. XSLT 2 is far more flexible, and far more explicitly
system-dependent in this area, which is probably a good thing. The
question is about what the XSLT 1 spec says.

> There will be all sorts of interesting issues that arise in considering
> the natural ordering of words from different linguistic groups, not
> borrowings like yacht and pyjama, but with equal cultural weight. 

Yes, of course.

> I suspect you are in a minority of one and the unanimity of the XSLT
> processors suggests that the interpretation they have adopted is the
> correct one.

I wouldn't disagree with you that the evidence suggests that, within the
relevant community, I am in a minority. However, the phrase
"lexicographic ordering" is (and has been for a century or so) totally
standard terminology, used without comment in any mathematical work on
ordered sets (a field which covers a large part of the mathematical
literature), and it is similarly standard terminology in any computer
science discussion of sorting. Given that, I wouldn't say that there is
any room for interpretation in the text of the XSLT 1 spec. It would take
an erratum changing the text of the specification to justify the
currently implemented algorithms.
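
As an illustration of the mathematical sense of the term, here is a minimal Python sketch of lexicographic ordering: given a total order on characters, two strings are compared at the first position where they differ, and a proper prefix sorts before the longer string. The alphabet order used here (A < a < B < b < ...) is an assumption chosen only for the example and is not quoted from the XSLT 1 spec; `RANK` and `lex_compare` are hypothetical names.

```python
import string
from functools import cmp_to_key

# Assumed character order for illustration: A < a < B < b < ... < Z < z.
RANK = {}
for i, (up, lo) in enumerate(zip(string.ascii_uppercase, string.ascii_lowercase)):
    RANK[up] = 2 * i
    RANK[lo] = 2 * i + 1


def lex_compare(s, t, rank=RANK):
    """Lexicographic comparison of s and t under the character order `rank`."""
    for a, b in zip(s, t):
        if rank[a] != rank[b]:
            # The first differing position decides the result.
            return -1 if rank[a] < rank[b] else 1
    # One string is a prefix of the other: the shorter one sorts first.
    return (len(s) > len(t)) - (len(s) < len(t))


words = ["daily", "Dalmatian", "dalmatic", "DAILY", "dalmatioN"]
lex_sorted = sorted(words, key=cmp_to_key(lex_compare))
# ['DAILY', 'Dalmatian', 'daily', 'dalmatic', 'dalmatioN']
```

Under this reading the case of the first letter is as significant as any other difference in that position, so Dalmatian sorts before daily, whereas an ordering that treats case as the least significant criterion puts daily first.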

I can understand if lexicographers are annoyed if the term
"lexicographic ordering" doesn't describe an ordering that they
recognise as useful, as it is a purely mechanical ordering ignoring the
art of lexicography entirely, but on the other hand they should be used
to the idea that words get used by convention in ways not immediately
suggested by their etymology.

