Character set conversion, locales, UTF-8, etc

Polytropon freebsd at edvax.de
Mon Nov 5 19:34:37 UTC 2012


On Mon, 5 Nov 2012 14:05:17 -0500, grarpamp wrote:
> >> As an aside, why does FreeBSD seem to default to the above locale
> >> instead of say, en_US.UTF-8 ?
> >
> > FreeBSD's file system does not default to any locale, as far as I
> > know. The system is "agnostic" to what the characters in the file
> > name mean or what symbol they should represent.
> 
> Sure the fs is just binary, then viewed and written through
> the mask of the selected language layer I think.

Yes, that seems to be the case.



> > There isn't much you can do on file system level except renaming
> > the files: write a program that reads the file names according
> > to the preferred interpretation and write new names for them,
> 
> I'll read more on language to see if I can reverse that and
> recover them or just replace with X's.

For X it's important to have the required language variables
set and fonts installed that contain the characters to be displayed.
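
As for the renaming approach quoted above, here is a minimal sketch
in Python. It assumes the old names were written as ISO8859-1 and
that UTF-8 is the target; both are guesses you would have to adjust.
The function names recode_name and rename_tree are made up for this
illustration. Bytes that cannot be decoded in the assumed source
encoding are replaced with "X", and nothing is renamed unless
dry_run is turned off.

```python
import os


def recode_name(raw, src="iso8859-1", dst="utf-8"):
    """Reinterpret raw file name bytes as `src` and re-encode them as `dst`.

    Names that already decode cleanly as `dst` are returned unchanged,
    so running this twice does not double-encode anything.
    """
    try:
        raw.decode(dst)
        return raw
    except UnicodeDecodeError:
        pass
    # Undecodable bytes become U+FFFD first, then a plain "X".
    text = raw.decode(src, errors="replace").replace("\ufffd", "X")
    return text.encode(dst)


def rename_tree(top, dry_run=True):
    """Walk `top` (a bytes path) bottom-up and recode every entry name.

    Returns the list of (old_path, new_path) pairs that were (or, in a
    dry run, would be) renamed.
    """
    planned = []
    # Bottom-up, so a directory is renamed only after its contents.
    for dirpath, dirnames, filenames in os.walk(top, topdown=False):
        for old in filenames + dirnames:
            new = recode_name(old)
            if new == old:
                continue
            old_path = os.path.join(dirpath, old)
            new_path = os.path.join(dirpath, new)
            if os.path.lexists(new_path):
                print("skipping, target exists:", repr(new_path))
                continue
            print(repr(old_path), "->", repr(new_path))
            planned.append((old_path, new_path))
            if not dry_run:
                os.rename(old_path, new_path)
    return planned
```

Calling rename_tree(b".") only prints what would change. Passing the
path as bytes matters here: it makes os.walk hand back the raw names
without applying the current locale to them.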



> I was looking mostly for a tool that would show me what a
> filename or data looks like in hex, octal, and different
> selected encodings. Doing it by hand is slow. I'll check
> ports again.

The system already brings such a tool: od (octal, decimal,
hex, ASCII dump). For example:

	% ls -w müslifraß.txt | od -h
	0000000      fc6d    6c73    6669    6172    2edf    7874    0a74

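This is on en_US.ISO8859-1 (German special characters will be
displayed properly as single bytes both in X and in console mode).
Note that od -h prints 16-bit words, so on a little-endian machine
each pair of bytes appears swapped: "fc6d" is the byte 6d ("m")
followed by fc ("ü").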
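
If you want hex, octal and several candidate encodings side by side,
a few lines of Python will do. This is only a sketch: the function
name describe and the default list of encodings are my own choice
for illustration, not an existing tool.

```python
import os
import sys


def describe(raw, encodings=("iso8859-1", "utf-8", "cp437")):
    """Return text lines showing `raw` in hex, in octal, and decoded
    under each candidate encoding (undecodable bytes become U+FFFD)."""
    lines = [
        "hex:   " + " ".join(f"{b:02x}" for b in raw),
        "octal: " + " ".join(f"{b:03o}" for b in raw),
    ]
    for enc in encodings:
        text = raw.decode(enc, errors="replace")
        lines.append(f"{enc + ':':<12}{text}")
    return lines


if __name__ == "__main__":
    # os.fsencode() recovers the original bytes of each argument,
    # whatever the current locale made of them.
    for name in sys.argv[1:]:
        for line in describe(os.fsencode(name)):
            print(line)
        print()
```

Unlike od -h, this prints one byte at a time in file order, so there
is no byte swapping to untangle. How the decoded lines look on screen
still depends on the terminal's own encoding and fonts.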



-- 
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...


More information about the freebsd-questions mailing list