Why no non-Latin TODIGIT mappings in UTF-8.src?

Wolfgang Zenker wolfgang at lyxys.ka.sub.org
Mon May 28 18:17:36 UTC 2007


* Andrey Chernov <ache at freebsd.org> [070528 14:49]:
> On Mon, May 28, 2007 at 02:34:56PM +0200, Wolfgang Zenker wrote:

>> What would be a good place to read
>> up about how much can be localised with locales and how much of it we
>> currently (and maybe in the near future) support?

> The Open Group Base Specs Issue 6
> http://www.opengroup.org/onlinepubs/009695399/toc.htm

So, as section 7.3.1 says, in the "POSIX locale", which appears to be
another name for the "C" locale, only '0' to '9' can be defined as members
of class digit. Because we use UTF-8.src as the source for the "C" locale,
we cannot add definitions for digits in other scripts, right?

In "a locale", which appears to be the generic case now, we are only
allowed to define the digits <zero> to <nine> in the digit class. The
digits '0' to '9' from the "portable character set" (= ASCII?) would be
automatically included in the class.

So if we have a locale using a non-Latin script that happens to have its
own "digit" characters, we cannot use UTF-8.src for the LC_CTYPE
definitions, but would do best to work from a copy and add DIGIT mappings
for the digit characters of that script? Or are <zero> to <nine>
again fixed to be the ASCII codes '0' to '9'?
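
To make the question concrete, here is a rough sketch of the kind of
digit-to-value mapping I have in mind for a non-Latin script. It does not
read UTF-8.src or any locale definition; it only queries the Unicode
character database through Python's unicodedata module. The helper names
(digit_value, is_ascii_digit, script_digits) are made up for this
illustration, and the assumption that a script's ten digits occupy
consecutive code points holds for the two blocks used here but is not
checked in general.

```python
import unicodedata


def digit_value(ch):
    """Return the decimal value (0-9) Unicode assigns to ch, or None."""
    # unicodedata.decimal() returns the default when ch is not a decimal digit.
    return unicodedata.decimal(ch, None)


def is_ascii_digit(ch):
    """True only for '0'..'9', the digits of the portable character set."""
    return "0" <= ch <= "9"


def script_digits(zero):
    """Return the ten characters starting at `zero`.

    Assumption: the script's digits are contiguous and ascending,
    as they are for Arabic-Indic (U+0660) and Devanagari (U+0966).
    """
    return [chr(ord(zero) + i) for i in range(10)]


if __name__ == "__main__":
    for ch in script_digits("\u0660"):
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
              f"value={digit_value(ch)} ascii_digit={is_ascii_digit(ch)}")
```

The point of the sketch is the split between the two helpers: the
Arabic-Indic digits carry the values 0 to 9 but are not among '0' to '9',
which is the case I am asking about for the digit class.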

Wolfgang

