bin/78087: groups program inconsistency
Bruce Evans
bde at zeta.org.au
Tue Mar 1 20:40:25 GMT 2005
The following reply was made to PR bin/78087; it has been noted by GNATS.
From: Bruce Evans <bde at zeta.org.au>
To: Russ Francis <russell.francis at gmail.com>
Cc: freebsd-gnats-submit at FreeBSD.org, dds at FreeBSD.org
Subject: Re: bin/78087: groups program inconsistency
Date: Wed, 2 Mar 2005 07:35:55 +1100 (EST)
On Mon, 28 Feb 2005, Russ Francis wrote:
> I saw this report and 78087 which looks similar and thought they looked curious.
>
> I have tried to duplicate this issue without luck and am running
>
> FreeBSD 5.3-RELEASE-p2
>
> bash-2.05b$ sysctl kern.ngroups
> kern.ngroups: 16
>
> relevant portion of /etc/group
> --- snip ---
> foouser:*:1004:
> foo1:*:20001:foouser
> foo2:*:20002:foouser
> foo3:*:20003:foouser
> foo4:*:20004:foouser
> foo5:*:20005:foouser
> foo6:*:20006:foouser
> foo7:*:20007:foouser
> foo8:*:20008:foouser
> foo9:*:20009:foouser
> foo10:*:20010:foouser
> foo11:*:20011:foouser
> foo12:*:20012:foouser
> foo13:*:20013:foouser
> foo14:*:20014:foouser
> #foo15:*:20015:foouser
There is a longstanding off-by-1 bug or two concerning {NGROUPS_MAX}.
POSIX specifies that {NGROUPS_MAX} gives the number of supplementary
group ids *in addition* to the effective group id. {NGROUPS_MAX} is
16 in FreeBSD so you should be able to have up to foo16 in the above.
This never worked in FreeBSD, so {NOGROUPS_MAX} is off-by-1.
ISTR that there is another off-by-1 bug in setgroups(2) or in library
callers of it. I forget the details.
> su-2.05b$ whoami
> foouser
> su-2.05b$ id
> uid=1004(foouser) gid=1004(foouser) groups=1004(foouser), 20001(foo1),
> 20002(foo2), 20003(foo3), 20004(foo4), 20005(foo5), 20006(foo6),
> 20007(foo7), 20008(foo8), 20009(foo9), 20010(foo10), 20011(foo11),
> 20012(foo12), 20013(foo13), 20014(foo14)
> su-2.05b$ id foouser
> uid=1004(foouser) gid=1004(foouser) groups=1004(foouser), 20001(foo1),
> 20002(foo2), 20003(foo3), 20004(foo4), 20005(foo5), 20006(foo6),
> 20007(foo7), 20008(foo8), 20009(foo9), 20010(foo10), 20011(foo11),
> 20012(foo12), 20013(foo13), 20014(foo14)
>
> When I uncomment the line '#foo15:*:20015:foouser' from /etc/group
> I get the following error
>
> bash-2.05b$ su foouser
> Password:
> su: setusercontext: Invalid argument
There now seems to be an off-by-2 bug. I have something like the
above foo[1-15] in my /etc/group for just one user, for the purpose
of running old POSIX tests (NIST PCTS) of {NGROUPS_MAX}. This used
to work, but I now get the above error from su. login fails too,
but doesn't print a message. The POSIX tests only worked because
I misconfigured them to hide the longstanding off-by-one error:
% SUPPLEMENTARY_GROUPS "0,1000,1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1011,1012,1013,1014"
The account is supposed to have primary group 0 (wheel) and 16 supplementary
groups 1000-1015, but #1015 doesn't work so I left it out of /etc/groups
and fudged the tests to succeed by putting the primary group in the
supplementary groups.
The new bug is exposed by:
% RCS file: /home/ncvs/src/lib/libc/gen/initgroups.c,v
% Working file: initgroups.c
% head: 1.8
% ...
% ----------------------------
% revision 1.8
% date: 2003/11/19 15:51:26; author: dds; state: Exp; lines: +7 -2
% Fix problem where initgroups would silently truncate groups with
% more than NGROUP elements without providing the opportunity to
% setgroups to fail and correctly return error and set errno.
%
% MFC after: 2 weeks
% ----------------------------
%
% Index: initgroups.c
% ===================================================================
% RCS file: /home/ncvs/src/lib/libc/gen/initgroups.c,v
% retrieving revision 1.7
% retrieving revision 1.8
% diff -u -2 -r1.7 -r1.8
% --- initgroups.c 1 Feb 2002 00:57:29 -0000 1.7
% +++ initgroups.c 19 Nov 2003 15:51:26 -0000 1.8
% ...
% @@ -51,7 +51,12 @@
% gid_t agroup;
% {
% - int groups[NGROUPS], ngroups;
% + int ngroups;
% + /*
% + * Provide space for one group more than NGROUPS to allow
% + * setgroups to fail and set errno.
% + */
% + gid_t groups[NGROUPS + 1];
This has some style bugs.
%
% - ngroups = NGROUPS;
% + ngroups = NGROUPS + 1;
% getgrouplist(uname, agroup, groups, &ngroups);
% return (setgroups(ngroups, groups));
getgrouplist() puts `agroup' in groups[0] and groups[1], so there are
only 14 slots left for the supplementary groups. Thus when there are
15 supplementary groups all different from `agroup', the above change
makes ngroups = 17 and setgroups() fails. Immediately before the above
change, the last group was just discarded so it couldn't have worked,
but think PCTS would have found the bug if the last group had not worked
long ago.
getgrouplist() has to put `agroup' in groups[0] because the group list
has to include the primary group in FreeBSD due to a dubious efficiency
hack in the implementation -- in FreeBSD, the effective group id is
just groups[0]. This efficiency hacks is what gives the first off-by-1
error -- {NGROUPS_MAX} iis missing the adjustment for the hack.
getgrouplist() normally puts `agroup' in groups[1] according to the
following comment:
% /*
% * When installing primary group, duplicate it;
% * the first element of groups is the effective gid
% * and will be overwritten when a setgid file is executed.
% */
I don't understand why this is needed. It obviously increments the off-by-1
error count, since the setgroups(2) up immediately in the kernel if the
groups count exceeds {NGROUPS_MAX}.
getgrouplist() has changed signficantly near this bug since rev.1.1, but
the change only reduces the bugs:
% Index: getgrouplist.c
% ===================================================================
% RCS file: /home/ncvs/src/lib/libc/gen/getgrouplist.c,v
% retrieving revision 1.1
% retrieving revision 1.13
% diff -u -2 -r1.1 -r1.13
% --- getgrouplist.c 27 May 1994 04:56:40 -0000 1.1
% +++ getgrouplist.c 16 Feb 2003 17:29:09 -0000 1.13
% ...
% @@ -70,17 +70,21 @@
% */
% setgrent();
% - while (grp = getgrent()) {
% - if (grp->gr_gid == agroup)
% - continue;
% - if (ngroups >= maxgroups) {
% - ret = -1;
% - break;
% + while ((grp = getgrent()) != NULL) {
% + for (i = 0; i < ngroups; i++) {
% + if (grp->gr_gid == groups[i])
% + goto skip;
Now getgrouplist() avoids wasting slots for duplicates in all cases
after the first 1 or 2 slots, where it use to only avoid duplicates of
`agroup'. Avoiding duplicates allows {NGROUPS_MAX} groups to actually
work in some cases, but cases where there are even {NGROUPS_MAX} - 1
different groups all different from `agroup' apparently never worked.
The kernel doesn't do any duplicate removal but probably should.
POSIX doesn't specify setgroups(), but it specifies getgroups(2).
For getgroups(), it is implementation-defined whether the effective
group is returned, and applications must be prepared for
{NGROUPS_MAX} + 1 gids being reurned. FreeBSD does return the
egid in the array. FreeBSD's setgroups needs the egid in slot 0
in the array.
Bruce
More information about the freebsd-bugs
mailing list