Filename containing French characters ?

Modulok modulok at gmail.com
Mon May 23 15:50:50 UTC 2011


Short answer, use a glob pattern. Assume I have a file named 'à fichier.txt':

    ls -l
    -rw-r--r--  1 Modulok  Modulok       12 May 23 09:01 ?? fichier.txt

    mv ?\ fichier.txt aFile.txt

Long answer, for those who want to follow along and fix their terminal to
display UTF-8, keep reading...

Step 1: Make a funky file to play along with this mini-tutorial:
================================================================

Create a text file with an editor that supports non-ASCII characters. I
created a file named 'filename' containing this (no trailing newline!):

        à fichier.txt

Step 2: Create the actual file with content
===========================================

I used echo and cat like so in the tcsh shell:

        echo "hello world" > "`cat filename`"
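
If you don't have an editor handy that can type the 'à', here is a rough
alternative sketch for Steps 1 and 2 in plain POSIX sh (not tcsh). It assumes
the filename is stored as UTF-8, where 'à' is the two bytes 0xC3 0xA0. The
'workdir' scratch directory and the 'funky' prefix are names I made up so the
example does not touch your current directory:

```shell
#!/bin/sh
# Scratch directory (hypothetical name) so nothing real gets overwritten.
workdir=$(mktemp -d "${TMPDIR:-/tmp}/funky.XXXXXX")

# \303\240 are the two UTF-8 bytes of 'à' (0xC3 0xA0), written in octal.
name=$(printf '\303\240 fichier.txt')

# Same content as the echo in Step 2: 11 characters plus a newline.
echo "hello world" > "$workdir/$name"

ls -l "$workdir"
```

The file ends up 12 bytes long, which matches the size in the listings here.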


Step 3: Show the file in ls
===========================

As you can see below, the first character of the filename is displayed as two
question marks. This is how ls shows bytes that it cannot print in the current
locale. There are two question marks because, in UTF-8, 'à' is a two-byte
character. This does *not* mean the filename starts with a literal question
mark:


    -rw-r--r--  1 Modulok  Modulok       12 May 23 09:01 ?? fichier.txt
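
If you want to convince yourself that those two question marks really are one
two-byte character, a small sketch like this dumps the raw bytes of the name.
It is POSIX sh rather than tcsh, and it assumes the name is encoded as UTF-8;
the variable names are just for illustration:

```shell
#!/bin/sh
# The filename from Step 1, built from its UTF-8 bytes.
name=$(printf '\303\240 fichier.txt')

# Hex dump of the whole name, one byte per column.
printf '%s' "$name" | od -An -tx1

# Keep only the first two bytes: c3a0, the UTF-8 encoding of 'à'.
first_bytes=$(printf '%s' "$name" | od -An -tx1 | tr -d ' \n' | cut -c1-4)
echo "$first_bytes"

# Length in bytes: 2 for 'à', 1 for the space, 11 for 'fichier.txt'.
printf '%s' "$name" | wc -c
```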

Step 4: (optional) Fix the terminal
===================================

At this point, let's just fix the terminal so that UTF-8 characters are
displayed correctly. We want to see the French accented 'à', and not a bunch of
question marks. To do this, you edit '/etc/login.conf' as root. Add two lines
at the bottom of the 'default' section. My default section now looks like this:


    default:\
            :passwd_format=md5:\
            :copyright=/etc/COPYRIGHT:\

            ...and so on...

            :charset=en_US.UTF-8:\
            :lang=en_US.UTF-8:

If you're running a French-language system, yours should probably look like
this instead:

    default:\
            :passwd_format=md5:\
            :copyright=/etc/COPYRIGHT:\

            ...and so on...

            :charset=fr_FR.UTF-8:\
            :lang=fr_FR.UTF-8:

I'm not certain about these for all countries, but the above examples work. We
then need to rebuild the actual login database. Execute the following command
as root:

    cap_mkdb /etc/login.conf

This generates /etc/login.conf.db from /etc/login.conf. Now log out and then
back in!
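
After logging back in, one way to check that the new setting actually reached
your session is to look at the locale variables. The sketch below is POSIX sh
(not tcsh), and 'effective_ctype' is a helper name I made up. It follows the
standard precedence: LC_ALL wins, then LC_CTYPE, then LANG. Running 'locale'
with no arguments shows the same information in more detail.

```shell
#!/bin/sh
# Print the locale name that governs character handling, using the
# POSIX precedence: LC_ALL, then LC_CTYPE, then LANG.
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    else
        # Nothing set at all means the default "C" locale is in effect.
        printf '%s\n' "${LANG:-unset}"
    fi
}

effective_ctype
```

If this prints something ending in 'UTF-8', the login.conf change took effect.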


Step 5: Back to the funky file
==============================

You should now see the accented characters displayed correctly (assuming your
terminal supports UTF-8):

    -rw-r--r--  1 Modulok  Modulok       12 May 23 09:01 à fichier.txt

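In some terminals, we cannot type these characters, so you can access the
file through a shell glob pattern instead. In most shells, the glob pattern '?'
matches any single character. The backslash escapes the space in the
filename. If your locale is not UTF-8, the shell may treat 'à' as two separate
bytes, in which case '??' or '*' may be needed in place of '?'.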

    mv ?\ fichier.txt aFile.txt
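
For anyone who wants to try the rename safely first, here is a self-contained
sketch in POSIX sh (not tcsh). It works in a scratch directory ('workdir' and
the 'funky' prefix are made-up names) and assumes the filename is UTF-8. I
used '*' instead of '?' here because it matches whether the shell counts 'à'
as one character or as two bytes:

```shell
#!/bin/sh
# Recreate the funky file in a scratch directory.
workdir=$(mktemp -d "${TMPDIR:-/tmp}/funky.XXXXXX")
name=$(printf '\303\240 fichier.txt')
echo "hello world" > "$workdir/$name"

cd "$workdir" || exit 1

# '*' matches any run of characters; the backslash escapes the space.
# '--' stops option parsing in case a matched name starts with a dash.
mv -- *\ fichier.txt aFile.txt

ls
```

Note that a glob which matches more than one file would make mv fail, so it
is worth running 'ls' with the same pattern first in a real directory.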


Hope this helps (and doesn't get too mangled.)
-Modulok-


More information about the freebsd-questions mailing list