accents in file names

Mihai Donțu mihai.dontu at gmail.com
Mon Feb 16 12:32:00 PST 2009


On Friday 13 February 2009, Chuck Swiger wrote:
> On Feb 12, 2009, at 2:50 PM, Wojciech Puchar wrote:
> >>> accented letter to my freebsd box, the accented letter simply
> >>> disappear.
> >>
> >> UFS supports 8-bit characters except for "/" and "\0", but you also
> >> need to run a terminal with UTF8 support and use a correct font to
> >> view such things.
> >
> > why? i use ISO-8859-2
>
> You've answered "why" when you state that you set up a locale which
> supports ISO Latin-X charset.  If you are running in the default C/
> POSIX locale, using the US-ASCII character set and a font that only
> knows about 7-bit ASCII glyphs, then you won't get accented characters.
>
> > UFS doesn't deal with encoding at all, just store what you give
>
> That's right, which means you need to use filenames encoded in UTF8
> rather than in arbitrary Unicode.

UTF-8 is what we prefer these days, but the filesystem can handle any
encoding that is ASCII compatible (like you said: Shift_JIS, EUC-JP, etc.).
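
To make that concrete, here is a small Python sketch (my own illustration,
nothing taken from the UFS code) that encodes the same name in a few charsets
and checks for the two bytes mentioned above, "/" and "\0". The helper name
ufs_safe is made up for this example.

```python
# Bytes a UFS file name component cannot contain: NUL and "/".
FORBIDDEN = {0x00, 0x2F}

def ufs_safe(name: str, encoding: str) -> bool:
    """Return True if `name`, encoded with `encoding`, avoids NUL and '/'.

    Hypothetical helper for illustration only: the kernel knows nothing
    about encodings, it only ever sees the resulting bytes.
    """
    raw = name.encode(encoding)
    return not any(b in FORBIDDEN for b in raw)

name = "fil\u00e9.txt"  # "filé.txt" with a precomposed e-acute

results = {enc: ufs_safe(name, enc)
           for enc in ("utf-8", "iso8859-2", "utf-16-le", "utf-32-le")}

for enc, ok in results.items():
    verdict = "ok" if ok else "contains NUL or '/'"
    print(enc, name.encode(enc).hex(" "), verdict)
```

UTF-16 and UTF-32 fail the check because every ASCII letter in the name
carries zero bytes, so they cannot be used for names on disk whatever
programs use in memory.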

Now, I assume Daniel was copying "filé.txt" from a non-UFS filesystem
(a Windows box, FAT32, NTFS, etc.) to UFS, because this is the only case I
can think of in which such a problem might appear.
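
I have not seen Daniel's setup, so the following is only a guess at the
mechanism, not a diagnosis. It shows in Python how a conversion to a charset
that lacks the letter can make it vanish without any error, if the converter
is told to ignore what it cannot represent.

```python
name = "fil\u00e9.txt"  # "filé.txt"

# Converting to a charset without the letter, with errors ignored,
# silently drops it from the name.
dropped = name.encode("ascii", errors="ignore")

# With errors="replace" a placeholder survives instead.
replaced = name.encode("ascii", errors="replace")

print(dropped, replaced)
```

Whether the copy tool or the mount options behave like "ignore" here is an
assumption on my part.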

> People in Asia tend to want UTF-16 
> or UTF-32 encoding (although historical encodings like Big5, Shift-
> JIS, and now GB18030 for China are still rather popular, and those are
> multibyte encodings), and things like gcc's implementation of
> widechars or Python are standardizing on UTF-32.

-- 
Mihai Donțu

