From: Warner Losh <imp@bsdimp.com>
Date: Thu, 24 Mar 2022 13:12:10 -0600
Subject: Re: What's the locale for system files (e.g. /etc/fstab)?
To: "Simon J. Gerraty" <sjg@juniper.net>
Cc: "Rodney W. Grimes", Phil Shafer, FreeBSD Hackers
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers
In-Reply-To: <71356.1648139436@kaos.jnpr.net>
References: <70B211BB-15BA-47A4-8F9C-C833AA8C1EAA@freebsd.org> <202203241519.22OFJ3Mk098649@gndrsh.dnsmgr.net> <71356.1648139436@kaos.jnpr.net>


On Thu, Mar 24, 2022, 10:30 AM Simon J. Gerraty <sjg@juniper.net> wrote:
Warner Losh <imp@bsdimp.com> wrote:
> Config files, like fstab, have no locale, and parsing them with a locale
> leads to errors, even when the user or the system has a nondefault locale.
>
> >
> > Put more generally, there's not a system-wide place which declares the
> > encoding for system files, which leads to this problem where we
> > interpret files from one user's locale using another user's locale.
>
> Well /etc/login.conf *IS* a system wide declaration of this type of
> stuff, both lang= and charset= are declared there.
>
> Since system wide files like these are always parsed without a locale,
> this information is correct, but I'm not sure how it applies.
>
> It is always C.UTF-8. Anything else may, or may not, work based on
> accidents of coincident encoding. Not everything can change locales, and
> the fstab and other parsing routines in libc assume C.UTF-8 or even just
> the ascii-7/8 subset.
>
> >
> > One solution would be a symlink in /etc that "points to" the name of the
> > current system-wide locale name.
> >
> > % ls -Fl /etc/locale
> > lrwxr-xr-x  1 root  wheel  7 Mar 23 15:42 /etc/locale@ -> C.UTF-8
>
> grep lang /etc/login.conf:
>         :lang=C.UTF-8:
>         :lang=ru_RU.UTF-8:\
>
> Probably what you want?

I doubt it, one is from the entry for Russian users ;-)

>
> You can get this with the locale routines, no? No need for grep.

I suspect not.

AFAIK virtually everything about locale support tells you about the
locale for the current process - which does not necessarily inform you
of the locale that was in effect when a system file was last edited.

I don't even know if it is guaranteed that everything that reads system
files groks random locales - or what happens when you have 3 admins each
preferring a different locale, do different entries in fstab for example
get impacted and the result thus not readable by anyone?
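
The mixed-locale scenario above can be made concrete with a minimal sketch. It assumes two hypothetical admins who each append a comment line to the same file, one from a UTF-8 locale and one from a KOI8-R locale. The comment text and the helper name `decodes_as` are invented for illustration and are not real fstab content.

```python
# Hypothetical illustration: the same comment written by two admins,
# each using the encoding of their own locale.
line_utf8 = "# диск".encode("utf-8")    # written from a *.UTF-8 locale
line_koi8 = "# диск".encode("koi8_r")   # written from a *.KOI8-R locale
mixed_file = line_utf8 + b"\n" + line_koi8 + b"\n"


def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if data is a valid byte sequence in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Read as UTF-8, the KOI8-R line is an invalid byte sequence.
utf8_ok = decodes_as(mixed_file, "utf-8")

# Read as KOI8-R, nothing fails, but the UTF-8 line turns into mojibake.
koi8_text = mixed_file.decode("koi8_r")
```

Neither reader sees both lines as intended, which is the "not readable by anyone" outcome. A parser that only looks at ASCII bytes would still find the `#` and the newline in both lines.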

There's probably something to be said for enforcing something like
C.UTF-8 for system files.

That is the primary reason for system files always being C.UTF-8... There
is no way to tag a file as anything else... and some of these files are
often parsed from a context that can't set the locale, like the boot loader
or the kernel... Also, these files have a format that was defined back in
the 7-bit ASCII time frame. They also don't make use of the text in a way
that isn't literal...
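
To illustrate what "literal" use of the text can look like, here is a minimal sketch of a locale-independent parser for an fstab-style line. It is not the libc implementation; `parse_fstab_line` and the sample entry (device name included) are hypothetical. The only point is that field splitting needs nothing beyond ASCII blanks and `#`, so bytes outside the ASCII range pass through untouched.

```python
from typing import List, Optional

ASCII_BLANKS = b" \t"


def parse_fstab_line(raw: bytes) -> Optional[List[bytes]]:
    """Split one fstab-style line into fields without consulting any locale.

    Returns None for blank lines and comment lines.
    Hypothetical helper for illustration, not libc's getfsent(3).
    """
    line = raw.rstrip(b"\r\n")
    stripped = line.lstrip(ASCII_BLANKS)
    if not stripped or stripped.startswith(b"#"):
        return None
    # bytes.split() with no argument splits on runs of ASCII whitespace
    # only, so bytes >= 0x80 are never treated as separators.
    return stripped.split()


# Hypothetical entry whose mount point contains non-ASCII (UTF-8) bytes.
entry = b"/dev/ada0p2\t/data/\xe7\x8c\xab\tufs\trw\t2\t2\n"
fields = parse_fstab_line(entry)
```

Because the fields stay as `bytes`, the parser behaves the same in the boot loader, in the kernel, or under any user's locale. Decoding for display can be left to whoever prints the result.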
Having said that, I'm unsure how you'd mount /<kanji-for-neko> from fstab,
or if that is well defined. The kernel just presents a string of bytes not
containing /...
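
As a rough sketch of why such a name is at least representable: a path component is a string of bytes containing neither `/` nor NUL, and UTF-8 never uses the byte 0x2F inside a multibyte sequence. The example assumes the kanji meant is U+732B and that the entry would be stored as UTF-8. Both are assumptions, not something stated in this thread, and `is_valid_component` is a hypothetical helper.

```python
# "/" followed by the kanji for neko, assumed here to be U+732B.
mount_point = "/\u732b"

raw = mount_point.encode("utf-8")   # the bytes that would sit in the file
component = raw[1:]                 # drop the leading "/"


def is_valid_component(name: bytes) -> bool:
    """A path component as a byte string: non-empty, no '/' and no NUL."""
    return len(name) > 0 and b"/" not in name and b"\x00" not in name


valid = is_valid_component(component)
```

Under these assumptions the bytes are a legal component. Whether they display as a kanji depends on the locale of whoever lists the mount, not on anything stored in fstab.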

Warner

--sjg