Unicode-based FreeBSD

Tz-Huan Huang tzhuan at csie.org
Tue Aug 26 02:36:11 UTC 2008


Hi,

On Tue, Aug 26, 2008 at 8:54 AM, Svavar Lúthersson <svavar at kjarrval.is> wrote:
> Alexander Churanov wrote:
>>
> I am not against your idea of adding the Unicode support in general, just
> that it doesn't go far enough. I did not misunderstand your point that it
> uses the same set but we would end up in the same situation when it will be
> time to add display and writing support for UTF-16 and UTF-32 although it
> would be slightly easier since we do not have to make an actual conversion
> of existing characters. Still we have to think about the programs that only
> "upgrade" to UTF-8 and have to go through yet another change to support
> UTF-16 or UTF-32. It is much easier in the long run to just go all the way
> to UTF-32 to begin with. Going to UTF-8 might fix some of the character
> issues but we would be in the same shoes when it comes to characters which
> are in -16 and -32 but not in -8. I am not a user of X in FreeBSD (but soon,
> I hope) so my FreeBSD environment is limited to the console.

How do you define ``support''?

If you mean software-level support, Vim supports UTF-16, Firefox
supports UTF-16/UTF-32, Perl supports UTF-16/UTF-32, etc.

If you mean system-level support, there are two cases:

1. The system's internal text representation stays in UTF-8, and
UTF-16/32 support is only added for the terminal, stdin/stdout/stderr,
etc. I think that is not so hard (I might be wrong, because I don't
know terminals at all), but I don't see any reason to set the locale
to UTF-16 or UTF-32.

2. The system's internal text representation is changed to UTF-16 or
UTF-32. This is another story and I have no comment on it.
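
To make case 1 concrete, here is a minimal Python sketch. It is only an
illustration, not how FreeBSD or any terminal actually does it. The
helper names to_wire and from_wire and the sample string are made up
for this example. The text is kept in one internal form and converted
only at the I/O boundary, and the same characters survive a round trip
through UTF-8, UTF-16 and UTF-32:

```python
# Hypothetical helpers: the "internal" form is a Python str, and the
# encoding is chosen only when bytes leave or enter the program.
def to_wire(text: str, encoding: str) -> bytes:
    """Encode internal text for output (terminal, stdout, a file, ...)."""
    return text.encode(encoding)


def from_wire(data: bytes, encoding: str) -> str:
    """Decode bytes read from input back into the internal text form."""
    return data.decode(encoding)


# Sample text mixing Icelandic, Chinese and Russian (made up for the demo).
sample = "Þórður 中文 Привет"

for encoding in ("utf-8", "utf-16-le", "utf-32-le"):
    data = to_wire(sample, encoding)
    # The round trip is lossless for every one of the three encodings;
    # only the number of bytes on the wire differs.
    assert from_wire(data, encoding) == sample
    print(encoding, len(data), "bytes")
```

The byte counts differ, but the decoded text is identical in all three
cases, which is why the choice of external encoding does not have to
affect the internal representation.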

> Windows (XP) displays the Icelandic alphabet with no problems at the command
> prompt. I also have no problems typing it. Somebody else has to try to see
> if other charsets, like Russian or Chinese, work there as well.

Windows is another case. Windows uses Unicode, but not in cmd.exe.
cmd.exe is a localized program, so the character set used in cmd.exe
depends on the language version of your Windows. For example, the
Traditional Chinese version of Windows uses CP950 as its localized
character set, which is almost the same as zh_TW.Big5.

In cmd.exe, you cannot display two languages (Chinese and Icelandic,
for example) at the same time, because Unicode is not supported there.
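
A small Python sketch of why a single code page cannot hold both
languages. This only checks what each character set can encode; it does
not drive cmd.exe itself. The helper can_encode is a made-up name, and
cp850 is my assumption for a typical Western European console code
page:

```python
def can_encode(text: str, codec: str) -> bool:
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


chinese = "中文"
icelandic = "þú"
mixed = chinese + " " + icelandic

# cp950: Traditional Chinese code page; cp850: assumed Western European
# console code page; utf-8: a Unicode encoding for comparison.
for codec in ("cp950", "cp850", "utf-8"):
    print(
        codec,
        "chinese:", can_encode(chinese, codec),
        "icelandic:", can_encode(icelandic, codec),
        "mixed:", can_encode(mixed, codec),
    )
```

Each legacy code page covers only its own language, so the mixed string
fits in neither of them, while a Unicode encoding covers both.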

Regards,
Tz-Huan


More information about the freebsd-current mailing list