Re: git: be1f7435ef21 - main - kern: start tracking cr_gid outside of cr_groups[]

From: Kyle Evans <kevans_at_FreeBSD.org>
Date: Thu, 31 Jul 2025 19:26:45 UTC
On 7/31/25 14:00, Kyle Evans wrote:
> On 7/31/25 13:46, Mark Johnston wrote:
>> On Thu, Jul 31, 2025 at 04:44:38AM +0000, Kyle Evans wrote:
>>> The branch main has been updated by kevans:
>>>
>>> URL: https://cgit.FreeBSD.org/src/commit/? 
>>> id=be1f7435ef218b1df35aebf3b90dd65ffd8bbe51
>>>
>>> commit be1f7435ef218b1df35aebf3b90dd65ffd8bbe51
>>> Author:     Kyle Evans <kevans@FreeBSD.org>
>>> AuthorDate: 2025-07-31 04:44:11 +0000
>>> Commit:     Kyle Evans <kevans@FreeBSD.org>
>>> CommitDate: 2025-07-31 04:44:11 +0000
>>>
>>>      kern: start tracking cr_gid outside of cr_groups[]
>>>      This is the (mostly) kernel side of de-conflating cr_gid and the
>>>      supplemental groups.  The pre-existing behavior for getgroups() and
>>>      setgroups() is retained to keep the user <-> kernel boundary
>>>      functionally the same while we audit use of these syscalls, but 
>>> we can
>>>      remove a lot of the internal special-casing just by reorganizing 
>>> ucred
>>>      like this.
>>>      struct xucred has been altered because the cr_gid macro becomes
>>>      problematic if ucred has a real cr_gid member but xucred does 
>>> not.  Most
>>>      notably, they both also have cr_groups[] members, so the definition
>>>      means that we could easily have situations where we end up using 
>>> the
>>>      first supplemental group as the egid in some places.  We really 
>>> can't
>>>      change the ABI of xucred, so instead we alias the first member 
>>> to the
>>>      `cr_gid` name and maintain the status quo.
>>>      This also fixes the Linux setgroups(2)/getgroups(2) 
>>> implementation to
>>>      more cleanly preserve the group set, now that we don't need to 
>>> special
>>>      case cr_groups[0].
>>>      __FreeBSD_version bumped for the `struct ucred` ABI break.
>>>      For relnotes: downstreams and out-of-tree modules absolutely 
>>> must fix
>>>      any references to cr_groups[0] in their code.  These are almost
>>>      exclusively incorrect in the new world, and cr_gid should be used
>>>      instead.  There is a cr_gid macro available in earlier FreeBSD 
>>> versions
>>>      that can be used to avoid having version-dependant conditionals 
>>> to refer
>>>      to the effective group id.  Surrounding code may need adjusted 
>>> if it
>>>      peels off the first element of cr_groups and uses the others as the
>>>      supplemental groups, since the supplemental groups start at 
>>> cr_groups[0]
>>>      now if &cr_groups[0] != &cr_gid.
>>>      Relnotes:       yes (see last paragraph)
>>>      Co-authored-by: olce
>>>      Differential Revision:  https://reviews.freebsd.org/D51489
>>
>> This syzbot report looks like it might be related to this change:
>> https://syzkaller.appspot.com/bug?extid=4e68da43c26f357a2b7e
>>
>> No reproducer yet, but sometimes it takes a little while.
> 
> I'll keep an eye out, thanks.  It strikes me that crsetgroups_internal 
> should probably grow a groups_check_max_len() call; this assertion 
> likely would have happened in a much more useful spot in the first place.
> 
> Thanks,
> 
> Kyle Evans

Actually, what crcopysafe() doing here is not ideal.  Note the comment 
from crextend():

2812         /* 

2813          * We allocate a power of 2 larger than 'nbytes', except 
when that
2814          * exceeds PAGE_SIZE, in which case we allocate the right 
multiple of
2815          * pages.  We assume PAGE_SIZE is a power of 2 (the call to 
roundup2()
2816          * below) but do not need to for sizeof(gid_t). 

2817          */ 


and then:

2830         cr->cr_agroups = nbytes / sizeof(gid_t); 


crcopysafe() then attempts to extend up to what cr_agroups can hold, but 
cr_agroups isn't guaranteed to be <= ngroups_max since we allocate a 
power of 2 larger; we'll want to cap that.

I was considering whether we should do it at assignment time or in 
crcopysafe().  I'm kind of leaning toward the latter, and slapping a 
note on cr_agroups somewhere that it may exceed the limit.  OTOH, we 
don't have a compelling need for cr_agroups to reflect the size of the 
allocation beyond the max number of groups today.

Thanks,

Kyle Evans

Thanks,

Kyle Evans