CLDR import for src/share/*def definitions

Wolfgang Zenker wolfgang at lyxys.ka.sub.org
Mon Jul 6 18:01:11 UTC 2009


Hi,

* Edwin Groothuis <edwin at mavetju.org> [090702 00:37]:
> I have been playing with the CLDR database to see if I can get the
> monetary, time, messages and numerical definitions right. The CLDR
> is in UTF-8, I use iconv to translate to other character sets.

> So far most of it is fine, except (subset of issues):

> - A couple of languages are not known (es_FR, es_IT)

What do you mean by "not known"? Both locales specify a Spanish language
locale, one for use in France and one in Italy. There might even be no
language constructs that differ from es_ES, just different ways to format
dates or something like that.

> - A couple of languages have a different abbreviation
>         no_NO   -> nb_NO nn_NO
>         *_YU    -> *_RS

You had best ask a Norwegian about the locales to use for Norway; for the
second line I'd go with *_RS, as the former Yugoslavia has split into,
among others, the Republic of Serbia (which I assume the code RS is
supposed to mean).

> - A couple of character sets are not known to iconv:
>         (CP1131 ISCII-DECV)

Never heard of them.

> - A couple of translations went wrong:
>         Writing to fi_FI in ISO8859-1
>         Could not convert currency_symbol from UTF-8 to ISO8859-1

That's because Finland uses the Euro and the Euro sign does not exist
in ISO8859-1; you have to use ISO8859-15 if you want the € currency
symbol in an ISO8859-* charset.
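
A quick way to check this is the following minimal sketch in Python. It
assumes Python's codec tables agree with iconv for these two charsets;
the helper name can_encode is made up for illustration.

```python
# Sketch: test whether a string survives conversion from Unicode to a
# given charset. can_encode is an illustrative helper, not an existing API.

def can_encode(text, charset):
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


euro = "\u20ac"  # the Euro sign

for charset in ("iso8859-1", "iso8859-15"):
    # ISO8859-1 has no Euro sign; ISO8859-15 places it at byte 0xA4.
    print(charset, can_encode(euro, charset))
```

The same check could be run over every currency_symbol before writing a
locale in a given charset, so that failures are reported up front.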

> - It is not clear what the difference between "Long month names (as
>   in a date)" and "Long month names (without case ending)" is. (could
>   be my language problem :-)

I don't know either; could you give an example where the two are different?
Preferably in a language for which we can find some speakers here :-)
I do speak English, German, Latin and some Arabic, if that is of any help.

> The biggest problem so far is not a technical one: Which data is more
> authoritative - The one in the CLDR database or the one we have 
> collected over the years from various sources and people?

I _think_ the CLDR has been relying on people coming forward with
information the same way that we have, so I consider neither _the_
authoritative source. Best to just list the conflicting entries and ask
around for locals on the list who could help resolve the conflicts.
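
Listing the conflicts could be as simple as the following sketch in
Python. It assumes both data sets have already been parsed into
dictionaries keyed by locale and then by field name; the function name
and the sample values are placeholders, not real locale data.

```python
# Sketch: report fields that both sources define but disagree on.
# list_conflicts and the sample dictionaries are hypothetical.

def list_conflicts(ours, cldr):
    """Yield (locale, key, our_value, cldr_value) for differing entries."""
    for locale in sorted(set(ours) & set(cldr)):
        shared_keys = set(ours[locale]) & set(cldr[locale])
        for key in sorted(shared_keys):
            if ours[locale][key] != cldr[locale][key]:
                yield locale, key, ours[locale][key], cldr[locale][key]


# Placeholder data only; "xx_XX" is not a real locale.
ours = {"xx_XX": {"d_fmt": "A", "yesstr": "B"}}
cldr = {"xx_XX": {"d_fmt": "A", "yesstr": "C"}}

for locale, key, our_value, cldr_value in list_conflicts(ours, cldr):
    print(f"{locale} {key}: ours={our_value!r} cldr={cldr_value!r}")
```

Entries that exist in only one source are skipped on purpose, since
they are additions rather than conflicts.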

> Another problem I'm facing is that there is little documentation 
> on what the format of the *def/ files is, it is mostly a UTSL 
> approach in lib/libc/locale, but that doesn't show me neither if I 
> can safely replace (for example in uk_UA)
>      # yesstr
>     -<E2><D0><DA>
>     +<E2><D0><DA>:<E2>:<C2><B0><BA>:<C2>:yes:y:YES:Y

Sorry, no clue, can't help you here.
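
The bytes can at least be made readable, though. A minimal sketch in
Python, assuming the uk_UA file in question is encoded in ISO8859-5 (the
quoted diff does not say which charset it is); decode_def_line is a
made-up helper name.

```python
import re

# Sketch: expand <XX> hex escapes into raw bytes and decode them.
# decode_def_line is illustrative; ISO8859-5 is an assumption here.

def decode_def_line(line, charset):
    """Replace each <XX> escape by its byte, then decode with charset."""
    raw = re.sub(
        rb"<([0-9A-Fa-f]{2})>",
        lambda m: bytes([int(m.group(1), 16)]),
        line.encode("ascii"),
    )
    return raw.decode(charset)


old = "<E2><D0><DA>"
new = "<E2><D0><DA>:<E2>:<C2><B0><BA>:<C2>:yes:y:YES:Y"

print(decode_def_line(old, "iso8859-5"))
print(decode_def_line(new, "iso8859-5"))
```

Under that assumption the old value decodes to the Ukrainian word for
"yes" and the new one to a colon-separated list of variants. Whether
libc accepts such a list is a separate question this does not answer.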

> So euhm... Is there anybody who wants to give their opinion or
> wisdom about things, please speak up, I need it :-)

I don't know if this was of any help, but here are my 2¢.

Wolfgang

