Why does the en_US.UTF-8 locale consider a < A?

Mark Martinec Mark.Martinec+freebsd at ijs.si
Thu Mar 9 09:26:44 UTC 2017


2017-03-08 16:59, Matthias Apitz wrote:
> I recently came across a related problem and have two questions
> (unresolved until now):
> 
> 1.
> According to the sort man page, it should be sufficient to set
> LC_COLLATE correctly. Yet setting LANG (or unsetting it) changes
> the sort order. Why?

The search/priority order is: LC_ALL -> LC_COLLATE -> LANG,
so in the absence of both LC_ALL and LC_COLLATE, LANG determines
the collation.


http://pubs.opengroup.org/onlinepubs/7908799/xbd/envvar.html :

The values of locale categories are determined by a precedence order; 
the first condition met below determines the value:

  If the LC_ALL environment variable is defined and is not null, the 
value of LC_ALL is used.

  If the LC_* environment variable ( LC_COLLATE, LC_CTYPE, LC_MESSAGES, 
LC_MONETARY, LC_NUMERIC, LC_TIME) is defined and is not null, the value 
of the environment variable is used to initialise the category that 
corresponds to the environment variable.

  If the LANG environment variable is defined and is not null, the value 
of the LANG environment variable is used.

  If the LANG environment variable is not set or is set to the empty 
string, the implementation-dependent default locale is used.
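
The precedence rules quoted above can be sketched in a few lines of
Python. This is only an illustration of the lookup order, not how any
libc actually implements it: effective_locale is a hypothetical helper
name, the environment is passed in as a plain dict, and falling back to
"C" stands in for the implementation-dependent default, which is an
assumption.

```python
def effective_locale(category, env, default="C"):
    """Return the locale name that applies to `category` (e.g. "LC_COLLATE").

    `env` is a dict of environment variables. `default` stands in for the
    implementation-dependent default locale; "C" is an assumption here.
    """
    # First match wins: LC_ALL, then the category's own variable, then LANG.
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:  # defined and not null; an empty string counts as unset
            return value
    return default


# Only LANG is set, so LANG decides the collation.
print(effective_locale("LC_COLLATE", {"LANG": "en_US.UTF-8"}))
# LC_COLLATE overrides LANG.
print(effective_locale("LC_COLLATE", {"LANG": "en_US.UTF-8", "LC_COLLATE": "C"}))
# LC_ALL overrides everything else.
print(effective_locale("LC_COLLATE", {"LANG": "C", "LC_COLLATE": "C",
                                      "LC_ALL": "de_DE.UTF-8"}))
```

The first call shows the case from question 1: with neither LC_ALL nor
LC_COLLATE set, changing or unsetting LANG changes the result.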



> 2.
> Speaking of German umlauts, should they be treated like their base
> letters, i.e. 'ä' like 'a', as one can read in Wiki, or how exactly
> are they sorted?



   Mark


More information about the freebsd-hackers mailing list