Re: What's the locale for system files (e.g. /etc/fstab)?

From: Warner Losh <imp_at_bsdimp.com>
Date: Thu, 24 Mar 2022 15:31:33 UTC
On Thu, Mar 24, 2022, 9:20 AM Rodney W. Grimes <freebsd-rwg@gndrsh.dnsmgr.net> wrote:

> > On 23 Mar 2022, at 11:51, Piotr Pawel Stefaniak wrote:
> > > mount: make libxo support more locale-aware
> > >
> > >    "special", "node", and "mounter" are not guaranteed to be encoded
> > >    with UTF-8. Use the appropriate modifier.
> > >
> > > -       xo_emit("{:special}{L: on }{:node}{L: (}{:fstype}", sfp->f_mntfromname,
> > > +       xo_emit("{:special/%hs}{L: on }{:node/%hs}{L: (}{:fstype}", sfp->f_mntfromname,
> > >              sfp->f_mntonname, sfp->f_fstypename);
> >
> > This recent "mount" patch highlights a libxo-related problem for which I
> > don't have a solution:
> >
> > There are several files for which the encoding is not known.  Since
> > locale is user specific, we don't know how to interpret the contents of
> > /etc/fstab.  It's presumably been encoded with the locale of the user who
> > wrote it, but that information is lost.
>
> Since you say "locale is user specific" it makes me want to say that
> this should come from the environment set by "default:" in /etc/login.conf,
> no need for a new file or anything special.
>

Config files, like fstab, have no locale and parsing them with a locale
leads to errors, even when the user or the system has a nondefault locale.

>
> > Put more generally, there's not a system-wide place which declares the
> > encoding for system files, which leads to this problem where we
> > interpret files from one user's locale using another user's locale.
>
> Well /etc/login.conf *IS* a system wide declaration of this type of
> stuff, both lang= and charset= are declared there.
>

Since system-wide files like these are always parsed without a locale, this
information is correct, but I'm not sure how it applies.

The locale for these files is always C.UTF-8. Anything else may, or may not,
work based on accidents of coincident encoding. Not everything can change
locales, and the fstab and other parsing routines in libc assume C.UTF-8 or
even just the 7- or 8-bit ASCII subset.
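
To illustrate the ASCII-subset point, here is a minimal sketch (not taken
from libc) of a check a tool could run on an fstab field before assuming its
bytes mean the same thing under every encoding. The helper name is_ascii7 is
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: return true if every byte of s is 7-bit ASCII,
 * i.e. the string reads identically in the C locale and in UTF-8.
 */
bool
is_ascii7(const char *s)
{
	assert(s != NULL);
	for (const unsigned char *p = (const unsigned char *)s; *p != '\0'; p++) {
		if (*p > 0x7f)
			return (false);
	}
	return (true);
}
```

A field that fails this check, such as a mount point containing an accented
character, is where the encoding of the file starts to matter.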

>
> > One solution would be a symlink in /etc that "points to" the name of the
> > current system-wide locale.
> >
> > % ls -Fl /etc/locale
> > lrwxr-xr-x  1 root  wheel  7 Mar 23 15:42 /etc/locale@ -> C.UTF-8
>
> grep lang /etc/login.conf:
>         :lang=C.UTF-8:
>         :lang=ru_RU.UTF-8:\
>
> Probably what you want?
>

You can get this with the locale routines, no? No need for grep.
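
A minimal sketch of that, assuming the standard POSIX calls setlocale() and
nl_langinfo(). It reports the codeset of the locale the process inherited
from its environment; whether that matches the lang= value in login.conf
depends on how the process was started. The function name current_codeset
is hypothetical.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the name of the character encoding of
 * the process's current locale, e.g. "UTF-8".
 */
const char *
current_codeset(void)
{
	/* Adopt the locale named by the environment (LC_ALL, LC_CTYPE, LANG). */
	if (setlocale(LC_CTYPE, "") == NULL) {
		/* The environment names an unknown locale: fall back to "C". */
		setlocale(LC_CTYPE, "C");
	}

	const char *cs = nl_langinfo(CODESET);
	assert(cs != NULL);
	return (cs);
}
```

The returned string is owned by the C library and may be overwritten by a
later setlocale() or nl_langinfo() call, so copy it if it must be kept.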

Warner

>
> > (Or "/etc/system.locale" ?)
> >
> > If the symlink doesn't exist, would "C.UTF-8" be a suitable default
> > moving forwards?  It certainly would not be backwards compatible, since
> > an existing fstab could have non-UTF-8 strings in it, encoded with the
> > locale of the user who touched the file.  But there's really no
> > backwards compatible solution, given that there's no guarantee that (for
> > any specific FreeBSD system) all system files were written with the same
> > locale.  Fun, eh? ;^)
> >
> > Opinions, thoughts, please?
> >
> > Thanks,
> >   Phil
> >
> >
>
> --
> Rod Grimes
> rgrimes@freebsd.org
>
>