Re: git: a4e4132fa3bf - main - swapoff(2): replace special device name argument with a structure
- Reply: Konstantin Belousov : "Re: git: a4e4132fa3bf - main - swapoff(2): replace special device name argument with a structure"
- In reply to: Jessica Clarke : "Re: git: a4e4132fa3bf - main - swapoff(2): replace special device name argument with a structure"
- Go to: [ bottom of page ] [ top of archives ] [ this month ]
Date: Sun, 05 Dec 2021 19:38:52 UTC
On 5 Dec 2021, at 12:56, Jessica Clarke wrote:
> On 5 Dec 2021, at 18:51, Konstantin Belousov <kostikbel@gmail.com> wrote:
>> On Sun, Dec 05, 2021 at 05:14:54PM +0000, Jessica Clarke wrote:
>>> On 5 Dec 2021, at 13:22, Konstantin Belousov <kostikbel@gmail.com> wrote:
>>>>
>>>> On Sun, Dec 05, 2021 at 03:03:26AM +0000, Jessica Clarke wrote:
>>>>> On 4 Dec 2021, at 22:21, Konstantin Belousov <kib@FreeBSD.org> wrote:
>>>>>>
>>>>>> The branch main has been updated by kib:
>>>>>>
>>>>>> URL: https://cgit.FreeBSD.org/src/commit/?id=a4e4132fa3bfadb6047fc0fa5f399f4640460300
>>>>>>
>>>>>> commit a4e4132fa3bfadb6047fc0fa5f399f4640460300
>>>>>> Author: Konstantin Belousov <kib@FreeBSD.org>
>>>>>> AuthorDate: 2021-11-29 16:26:31 +0000
>>>>>> Commit: Konstantin Belousov <kib@FreeBSD.org>
>>>>>> CommitDate: 2021-12-04 22:20:58 +0000
>>>>>>
>>>>>> swapoff(2): replace special device name argument with a structure
>>>>>>
>>>>>> For compatibility, add a placeholder pointer to the start of the
>>>>>> added struct swapoff_new_args, and use it to distinguish old vs. new
>>>>>> style of syscall invocation.
>>>>>>
>>>>>> Reviewed by: markj
>>>>>> Discussed with: alc
>>>>>> Sponsored by: The FreeBSD Foundation
>>>>>> MFC after: 1 week
>>>>>> Differential revision: https://reviews.freebsd.org/D33165
>>>>>> ---
>>>>>> sys/vm/swap_pager.c | 27 +++++++++++++++++++++++++--
>>>>>> sys/vm/swap_pager.h | 8 ++++++++
>>>>>> 2 files changed, 33 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/sys/vm/swap_pager.c b/sys/vm/swap_pager.c
>>>>>> index 165373d1b527..dc1df79f4fcd 100644
>>>>>> --- a/sys/vm/swap_pager.c
>>>>>> +++ b/sys/vm/swap_pager.c
>>>>>> @@ -2491,15 +2491,38 @@ sys_swapoff(struct thread *td, struct swapoff_args *uap)
>>>>>> struct vnode *vp;
>>>>>> struct nameidata nd;
>>>>>> struct swdevt *sp;
>>>>>> - int error;
>>>>>> + struct swapoff_new_args sa;
>>>>>> + int error, probe_byte;
>>>>>>
>>>>>> error = priv_check(td, PRIV_SWAPOFF);
>>>>>> if (error)
>>>>>> return (error);
>>>>>>
>>>>>> + /*
>>>>>> + * Detect old vs. new-style swapoff(2) syscall. The first
>>>>>> + * pointer in the memory pointed to by uap->name is NULL for
>>>>>> + * the new variant.
>>>>>> + */
>>>>>> + probe_byte = fubyte(uap->name);
>>>>>> + switch (probe_byte) {
>>>>>> + case -1:
>>>>>> + return (EFAULT);
>>>>>> + case 0:
>>>>>> + error = copyin(uap->name, &sa, sizeof(sa));
>>>>>> + if (error != 0)
>>>>>> + return (error);
>>>>>> + if (sa.flags != 0)
>>>>>> + return (EINVAL);
>>>>>> + break;
>>>>>> + default:
>>>>>> + bzero(&sa, sizeof(sa));
>>>>>> + sa.name = uap->name;
>>>>>> + break;
>>>>>> + }
>>>>>
>>>>> Doesn’t this change the semantics of swapoff("")?
>>>>>
>>>>> Previously it would fail deterministically, presumably with ENOENT or
>>>>> something, but now it reinterprets whatever follows that string in
>>>>> memory as the new argument structure. It probably doesn’t matter, but
>>>>> this approach is ugly. Can we not just define a new syscall rather than
>>>>> this kind of bodge?
>>>>
>>>> Having two swapoff() syscalls is worse, and having them only differ in
>>>> semantic by single flag is kind of crime.
>>>>
>>>> I do not see swapoff("") as problematic, we are changing a minor semantic of
>>>> the management syscall. I only wanted to avoid flag day for swapoff binaries.
>>>>
>>>> BTW, I considered requiring proper alignment for uap->name, and then checking
>>>> the whole uap->name_old_syscall for NULL, but then decided that this is
>>>> overkill. If you think that swapoff("") that important, I can add that
>>>> additional verification.
>>>
>>> Why’s it worse? It’s just a syscall number, you deprecate the old one
>>> and move on, we do that for things relatively regularly. This is really
>>> not a good solution; harder to use as a caller since the prototype is
>>> wrong, impossible to ensure you preserve the semantics for the existing
>>> interface in all cases, and ugly to implement. You don’t need a flag
>>> day for a new syscall, either, you can continue to only use the new
>>> method for -f for a release and then switch over to the new syscall
>>> entirely. Or switch over to the new syscall entirely now and fall back
>>> on the old syscall if -f isn’t passed. Defining a new syscall also lets
>>> you not need the name_old_syscall member in the struct, and gives you a
>>> clean, fully-extensible syscall to which future features can be added
>>> in a backwards-compatible way, rather than forever keeping around this
>>> legacy mess.
>>
>> I disagree, it is not just a syscall number, it is whole user/kernel
>> interface that bloats, which means cognitive efforts from anybody using
>> this interfaces, and for which we must maintain ABI compatibility.
>
> Which is just as true of this approach; you have the same two
> interfaces here, just smashed together into a single harder-to-use
> syscall rather than two separate syscalls. Having a separate syscall at
> least allows the old one to return ENOSYS in the future, whereas if you
> ever want to deprecate the old interface with this method then you’ll
> need some other weird error response that’s harder to interpret as
> meaning “that variant of this syscall doesn’t exist any more”.
>
>> New syscall allocation should be done only as a last resort, when existing
>> interfaces cannot be adopted for new functionality.
>
> Which this can’t without breaking the existing well-defined semantics,
> as I’ve stated.
>
>> Good (or rather, bad) example of the uglyness that is backed by the attitude
>> that syscalls are free, is whole *at() mess, or specific stat*() mess (old,
>> other bsds, pre-ino64, ino64, at, then stat vs fstat, then Linux statx which
>> probably fixes the interface ultimately).
>
> It’s better than this approach.
I have less resistance to adding new syscalls, and I agree with Jess that it
is the right thing to do in this case. Adding a syscall means the kernel
supports either old or new interface, so there is no flag day. And it is
easier to clean up; maintaining two syscalls should only be needed for a
while on -current, and not in any release.
Mike
> Jess