Turkish character sorting on PostgreSQL

Ismail YENIGUL ismail at EnderUNIX.org
Sat Dec 22 09:09:27 PST 2007


Hello Ivan,

Here is the test result. It seems that the problem is on FreeBSD
(6.2) . Because ö and ş are before then z in Turkish alphabet.

# cat a.c
#include <locale.h>

int main() {

        setlocale(LC_COLLATE, "tr_TR.ISO8859-9");
        printf("%d\n",strcoll("ö", "z"));
        printf("%d\n",strcoll("ü", "z"));
}
ftpfreebsd[~]# ./a
124
130
ftpfreebsd[~]# 

By the way,  LC_COLLATE is link to the  ../la_LN.US-ASCII/LC_COLLATE
in  /usr/share/locale/tr_TR.ISO8859-9 directory.  Does this mean that
LC_COLLATE is missing for tr_TR.ISO8859-9 ?

# ls -al /usr/share/locale/tr_TR.ISO8859-9/
total 14
drwxr-xr-x    2 root  wheel   512 Jul  9 15:32 .
drwxr-xr-x  157 root  wheel  4096 Dec  4  2006 ..
lrwxr-xr-x    1 root  wheel    28 Jul  9 15:32 LC_COLLATE -> ../la_LN.US-ASCII/LC_COLLATE
lrwxr-xr-x    1 root  wheel    26 Jul  9 15:32 LC_CTYPE -> ../la_LN.US-ASCII/LC_CTYPE
-r--r--r--    1 root  wheel    18 Jul  9 15:32 LC_MESSAGES
-r--r--r--    1 root  wheel    34 Jul  9 15:32 LC_MONETARY
-r--r--r--    1 root  wheel     8 Jul  9 15:32 LC_NUMERIC
-r--r--r--    1 root  wheel   352 Jul  9 15:32 LC_TIME


and there is no file spesicific  to the tr_TR.ISO8859-9 in   /usr/src/share/colldef/

# ls /usr/src/share/colldef/
Makefile                el_GR.ISO8859-7.src     la_LN.ISO8859-15.src    map.ISO8859-1           ru_RU.CP1251.src
README                  en_DK.example           la_LN.ISO8859-2.src     map.ISO8859-13          ru_RU.CP866.src
be_BY.CP1131.src        es_ES.ISO8859-1.src     la_LN.ISO8859-4.src     map.ISO8859-15          ru_RU.ISO8859-5.src
be_BY.CP1251.src        es_ES.ISO8859-15.src    la_LN.US-ASCII.src      map.ISO8859-2           ru_RU.KOI8-R.src
be_BY.ISO8859-5.src     et_EE.ISO8859-15.src    lt_LT.ISO8859-13.src    map.ISO8859-4           sl_SI.ISO8859-2.src
bg_BG.CP1251.src        hi_IN.ISCII-DEV.src     lt_LT.ISO8859-4.src     map.ISO8859-5           sr_YU.ISO8859-5.src
ca_ES.ISO8859-1.src     hy_AM.ARMSCII-8.src     map.ARMSCII-8           map.ISO8859-7           sv_SE.ISO8859-1.src
ca_ES.ISO8859-15.src    is_IS.ISO8859-1.src     map.CP1131              map.KOI8-R              sv_SE.ISO8859-15.src
cs_CZ.ISO8859-2.src     is_IS.ISO8859-15.src    map.CP1251              map.KOI8-U              uk_UA.CP1251.src
de_DE.ISO8859-1.src     kk_KZ.PT154.src         map.CP866               map.PT154               uk_UA.ISO8859-5.src
de_DE.ISO8859-15.src    la_LN.ISO8859-1.src     map.ISCII-DEV           pl_PL.ISO8859-2.src     uk_UA.KOI8-U.src


Thursday, December 20, 2007, 12:44:34 PM, you wrote:

> Ismail YENIGUL wrote:
>> Hello,
>> 
>> I am using PostgreSQL 8.2.5 on FreeBSD 6.2. But I have a problem with
>> sorting Turkish characters. They are listed after "z" character.
>> I initialized the PostgreSQL with the following values:
>> 
>> initdb -E UNICODE --locale=tr_TR.UTF-8 and

> Unicode (UTF-8) collations (sorting) don't work on FreeBSD. You can use
> PostgreSQL 8.1 and the "ICU" patch for it.

>> initdb -E LATIN5 --locale tr_TR.ISO8859-9

> This could work, if the locale is properly defined in the system locale
> database. Try creating a small C program that sorts your strings using
> strcoll() to verify this - if the small C program works, it's a
> PostgreSQL problem.





-- 
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ http://www.enderunix.org/ismail  http://www.endersys.com.tr       +
+ EnderUNIX SDT @ Tr               Endersys Consultancy Ltd.        +
+ ismail ~ enderunix.org           ismail.yenigul ~ endersys.com.tr +
+ Volunteer, Core Team Member      Project Manager                  +
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

TCP/IP ve Ağ güvenliği kitabının 2. baskısı çıktı!
http://dukkan.acikakademi.com



More information about the freebsd-questions mailing list