Re: noatime on ufs2

From: Mike Karels <mike_at_karels.net>
Date: Tue, 30 Jan 2024 18:49:22 UTC
On 30 Jan 2024, at 3:00, Olivier Certner wrote:

> Hi Warner,
>
>> I strongly oppose this notion to control this from loader.conf. Root is
>> mounted read-only, so it doesn't matter. That's why I liked Mike's
>> suggestion: root isn't special.
>
> Then in fact there is nothing to oppose.  You've just said yourself that root is mounted first read-only.  As Mike already said, it is remounted r/w in userland later in the boot process.  I just re-checked the code, because I only had a vague recollection of all this, and can confirm.
>
> I mentioned the need to modify '/etc/loader.conf' as a possible consequence, not as a goal.  Given what we have established, there is no need to change it at all.
>
> The root FS is thus in no way more special in the sysctl proposal than with Mike's (assuming it doesn't rely on sysctl), this is an independent property due to the boot process design.

With the possible exception that the sysctl mechanism might then have to
apply to mount update.

>>>> It also seems undesirable to add a sysctl to control a value that the
>>>> kernel doesn't use.
>>>
>>> The kernel has to use it to guarantee some uniform behavior irrespective
>>> of the mount being performed through mount(8) or by a direct call to
>>> nmount(2).  I think this consistency is important.  Perhaps all
>>> auto-mounters and mount helpers always run mount(8) and never deal with
>>> nmount(2), I would have to check (I seem to remember that, a long time ago,
>>> when nmount(2) was introduced as an enhancement over mount(2), the stance
>>> was that applications should use mount(8) and not nmount(2) directly).
>>> Even if there were no obvious callers of nmount(2), I would be a bit
>>> uncomfortable with this discrepancy in behavior.

Based on a quick git grep, it looks like most of the things in base use
nmount(2), not mount(2).  If they use mount(8), then it's not a problem
because mount(8) would be the first thing to get things right.  If, by
mount helpers, you mean things like mount_nfs and mount_mfs, then mount(8)
uses them rather than the reverse.  I also don't remember any admonition
not to use nmount(2).  mount(8) has a limited set of file system types that
it handles directly.

>> I disagree. I think Mike's suggestion was better and dealt with POLA and
>> POLA breaking in a sane way. If the default is applied universally in user
>> space, then we need not change the kernel at all.
>
> I think applying the changes to userland only is really a bad idea.  I've already explained why, but going to do it again in case you missed that.  If you have counter-arguments, fine, but I would like to see them.
>
> Changing userland only causes a discrepancy between mount(8) and nmount(2).  Even if the project would take a stance that nmount(2) is not a public API and mount(8) must always be used, the system call will still be there.  And if it's not supposed to be used, what's the problem with changing it as well?

I don't think that stance has been taken; nmount(2) is certainly documented.
But I think that user level changes are required in both cases.  First, for
the kernel to do the right thing, it needs to know if either noatime or atime
has been specified explicitly, or if the default should apply.  Otherwise, the
kernel can only force noatime to be used in all cases or none, which I believe
is a non-starter.  Second, for anything using mount(2), the flags include only
MNT_NOATIME, a single bit that can encode only two states, not the required
three.  It would be possible to add another flag meaning to actually use the
state of the MNT_NOATIME flag, but that would require user-level changes.
Third, if I
understand correctly, mount(8) parses the options and condenses the standard
boolean options like {,no}atime into a bit, preserving the last option
specified.  E.g. if the fstab lists noatime for a file system, and "mount -o
atime ..." is given on the command line, noatime will not be included in
the kernel options.  The kernel can't tell why, whether nothing was specified
or the option was explicit.  In theory, three states can be encoded using
nmount; options could include "atime", "noatime", or neither.  But that's
not what the current user level does, so changes are required.  Given that,
it makes the most sense for mount(8) and others to incorporate the
default into their operation, and just give the kernel the answer.  btw,
see mntopts(3) for where this code would go.
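
As a rough illustration of the three-state point above, here is a minimal
sketch in plain C.  It is not the mount(8) or mntopts(3) code; the names
atime_opt, scan_atime and want_noatime are made up for this example, and the
system-wide default is simply passed in as a parameter, since how it would be
obtained is exactly what is under discussion.

```c
#include <stdbool.h>
#include <string.h>

/* Three states: nothing specified, explicit "atime", explicit "noatime". */
enum atime_opt { ATIME_UNSPEC, ATIME_ON, ATIME_OFF };

/*
 * Scan a comma-separated option string (hypothetical helper).  The last
 * "atime" or "noatime" seen wins.  `prev` carries the state from an
 * earlier source, e.g. the fstab entry before the -o options.
 */
enum atime_opt
scan_atime(const char *opts, enum atime_opt prev)
{
	enum atime_opt st = prev;
	const char *p = opts;

	while (p != NULL && *p != '\0') {
		size_t len = strcspn(p, ",");

		if (len == 5 && strncmp(p, "atime", 5) == 0)
			st = ATIME_ON;
		else if (len == 7 && strncmp(p, "noatime", 7) == 0)
			st = ATIME_OFF;
		p += len;
		if (*p == ',')
			p++;
	}
	return (st);
}

/*
 * Decide in userland whether the noatime bit should be handed to the
 * kernel.  The default applies only when neither source said anything.
 */
bool
want_noatime(const char *fstab_opts, const char *cmdline_opts,
    bool default_noatime)
{
	enum atime_opt st = scan_atime(fstab_opts, ATIME_UNSPEC);

	st = scan_atime(cmdline_opts, st);
	if (st == ATIME_UNSPEC)
		return (default_noatime);
	return (st == ATIME_OFF);
}
```

With this sketch, want_noatime("rw,noatime", "atime", true) is false: the
explicit "atime" on the command line overrides both the fstab entry and the
default, which is the case a single bit cannot express to the kernel.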

> Second, we can control what is in the base system, but not other applications, so we can't really prevent nmount(2) from being used.
>
> Some of the goals of my proposal include simplifying things, both in terms of administration and in terms of the amount of code, and providing reliable behavior.  My current evaluation is that changing userland will require more code changes than the sysctl I propose, and it has all the drawbacks I've just mentioned.

I think that all of the user code needs changes in any case, for the reasons
above, so there is no need to change the kernel.

> What I find great in Mike's proposal is to use '/etc/fstab' to control filesystem defaults, because '/etc/fstab' is already the go-to place for filesystems and already holds options to apply to particular mounts.  But again, this is independent of where the mechanism is actually implemented.

Encoding the default as I proposed would make it awkward to communicate to
the kernel.  A startup script that ran early enough could parse it and turn
it into a sysctl, but the encoding works better for C programs that use
the fstab parsing code in mount(8).

>> We lose all the chicken and egg problems and the non-linearness of the sysctl idea.
>
> As already said above, there is in the end no such problem, and it wasn't linked at all with the sysctl idea.

I disagree, for the reasons above.

> On the contrary, with the '/etc/fstab' proposal, if there is no kernel backing, the loader must be modified to parse default options, and then pass them to the kernel (via 'vfs.root.mountfrom.options'), or the script remounting r/w be modified to apply the proper options (or 'mount -u' itself changed to do so).

The loader doesn't need the defaults.  My proposal assumed that mount -u
would implement the default mechanism, just like mount without -u.

>> If it's in fstab as default, then it would be read by whatever updates
>> things in user space.

As described.

> It's very unlikely that applications would not need modifications in this regard.  Mike even said that he wouldn't have getfsent() return such entries to avoid confusing existing programs.  Needing specific code makes this point moot (if you have to modify a program to read and process the special lines in '/etc/fstab', you can as well modify it to use sysctl(8)).

A sysctl would implement the default, but not per-filesystem options.
"mount -o atime /var/mail" should not be setting sysctls.

> The real advantage is direct modifications in a text file by an administrator, and this is why I like the '/etc/fstab' idea.
>
>> It obviates the need for the sysctl entirely.
>
> It doesn't obviate the need for a kernel mechanism (sysctl(8) or else), see argument on mount(8) and nmount(2) above.  And once you need a kernel mechanism, sysctl(8) is most probably the best candidate for tunables (why re-invent the wheel?).

Again, I disagree that having the kernel involved is necessary or
desirable.

>> It gets around the need to update loader.conf as well.
>
> You keep repeating that, but it's false as explained above.
>
>> It concentrates the change in one place and does so in a way that's not at all atime focused:  It could also be
>> generalized so that the FSTYPE could have different settings for different types of filesystem (maybe unique flags that some file systems don't
>> understand).
>
> You can also have this with a properly designed sysctl(8) hierarchy.

That's yet more mechanism that we don't need.

>> I don't like this, because it is atime focused. atime is a trivial little
>> optimization that really isn't worth the effort for the vast majority of
>> things.
>
> Others have disagreed; I am not going to summarize all the previous mails, they are there for anyone to read.
>
>> However, it would be nice to have some way to specify another layer
>> of defaults, like we do for rc variables, loader variables, etc. mount is
>> currently missing that generality. One could also put it in
>> /etc/defaults/fstab too and not break POLA since that's the pattern we use
>> elsewhere.
>
> I also think having the defaults in '/etc/defaults/fstab' would be better because it is more in line with what we're doing for rc(8) and loader(8).  This would be at the expense of discoverability for adopters, but it seems to be worth it given it applies to other things and has some logic.

The disadvantage of using /etc/defaults/fstab is that it hides the defaults
in a file that didn't previously exist, so people won't know to look there.
/etc/fstab is better in that it is most obvious.

>> I don't think the case for sysctl has been made. It's a big, inelegant
>> hammer that can be solved more elegantly like Mike suggested.
>
> I think it's the exact opposite.  As explained above, the change in defaults must be implemented in the kernel.  The inelegancy of the pure userland solution will become apparent in terms of the necessary changes' content, its higher number of lines of code and its intrinsic unreliability in the face of external applications using nmount(2).

I disagree that the kernel can, or should, implement the change in defaults
without modifying user level.  External programs that use nmount(2) can't do
the right thing *and* follow the defaults, because they don't tell the kernel
how they arrived at the options they provide.
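
To make the ambiguity concrete, here is a small sketch of what the kernel
would see.  The flag values are illustrative stand-ins, not the real
<sys/mount.h> definitions, and X_ATIME_EXPLICIT is a hypothetical extra flag
of the kind mentioned earlier; nothing like it exists today.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; not the real <sys/mount.h> definitions. */
#define X_NOATIME		0x1u	/* stands in for MNT_NOATIME */
#define X_ATIME_EXPLICIT	0x2u	/* hypothetical: X_NOATIME state is deliberate */

/* What the caller actually meant. */
enum intent { NOTHING_SAID, SAID_ATIME, SAID_NOATIME };

/* What a caller can express with the single existing bit. */
uint32_t
encode_one_bit(enum intent in)
{
	return (in == SAID_NOATIME ? X_NOATIME : 0);
}

/* What a modified caller could express with the hypothetical extra flag. */
uint32_t
encode_two_bits(enum intent in)
{
	switch (in) {
	case SAID_ATIME:
		return (X_ATIME_EXPLICIT);
	case SAID_NOATIME:
		return (X_ATIME_EXPLICIT | X_NOATIME);
	default:
		return (0);
	}
}

/*
 * Kernel-side decision in this sketch: the default applies only when
 * the caller did not mark its choice as explicit.
 */
bool
kernel_noatime(uint32_t flags, bool default_noatime)
{
	if ((flags & X_ATIME_EXPLICIT) == 0)
		return (default_noatime);
	return ((flags & X_NOATIME) != 0);
}
```

With one bit, NOTHING_SAID and SAID_ATIME produce the same flags word, so a
kernel-side default would override an explicit "atime".  The two-bit variant
separates them, but only for callers that have been changed to set the extra
flag; in this sketch an unmodified caller passing just X_NOATIME would get the
default instead, which is why user-level changes are needed either way.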

>> It follows the 'tools not rules' philosophy the project has had for decades.
>
> FreeBSD is far from being the only project having it.  Anyway, I've never proposed anything not along these lines.  Can you really argue that the sysctl proposal goes against that?
>
>> Anyway, I've said my piece. I agree with Mike that there's consensus for
>> this from the installer, and after that consensus falls away. Mike's idea
>> is one that I can get behind since it elegantly solves the general problem.
>
> In the current situation, I can back using '/etc/fstab', or probably better, '/usr/local/etc/fstab' to hold default mount options, but I'm strongly opposing a pure userland implementation as long as my objections above are not addressed properly.

We disagree.

		Mike

> Thanks and regards.
>
> -- 
> Olivier Certner