bin/78087: groups program inconsistency

Bruce Evans bde at zeta.org.au
Tue Mar 1 20:40:25 GMT 2005


The following reply was made to PR bin/78087; it has been noted by GNATS.

From: Bruce Evans <bde at zeta.org.au>
To: Russ Francis <russell.francis at gmail.com>
Cc: freebsd-gnats-submit at FreeBSD.org, dds at FreeBSD.org
Subject: Re: bin/78087: groups program inconsistency
Date: Wed, 2 Mar 2005 07:35:55 +1100 (EST)

 On Mon, 28 Feb 2005, Russ Francis wrote:
 
 > I saw this report and 78087 which looks similar and thought they looked curious.
 >
 > I have tried to duplicate this issue without luck and am running
 >
 > FreeBSD 5.3-RELEASE-p2
 >
 > bash-2.05b$ sysctl kern.ngroups
 > kern.ngroups: 16
 >
 > relevant portion of /etc/group
 > --- snip ---
 > foouser:*:1004:
 > foo1:*:20001:foouser
 > foo2:*:20002:foouser
 > foo3:*:20003:foouser
 > foo4:*:20004:foouser
 > foo5:*:20005:foouser
 > foo6:*:20006:foouser
 > foo7:*:20007:foouser
 > foo8:*:20008:foouser
 > foo9:*:20009:foouser
 > foo10:*:20010:foouser
 > foo11:*:20011:foouser
 > foo12:*:20012:foouser
 > foo13:*:20013:foouser
 > foo14:*:20014:foouser
 > #foo15:*:20015:foouser
 
 There is a longstanding off-by-1 bug or two concerning {NGROUPS_MAX}.
 POSIX specifies that {NGROUPS_MAX} gives the number of supplementary
 group ids *in addition* to the effective group id.  {NGROUPS_MAX} is
 16 in FreeBSD so you should be able to have up to foo16 in the above.
 This never worked in FreeBSD, so {NOGROUPS_MAX} is off-by-1.
 
 ISTR that there is another off-by-1 bug in setgroups(2) or in library
 callers of it.  I forget the details.
 
 > su-2.05b$ whoami
 > foouser
 > su-2.05b$ id
 > uid=1004(foouser) gid=1004(foouser) groups=1004(foouser), 20001(foo1),
 > 20002(foo2), 20003(foo3), 20004(foo4), 20005(foo5), 20006(foo6),
 > 20007(foo7), 20008(foo8), 20009(foo9), 20010(foo10), 20011(foo11),
 > 20012(foo12), 20013(foo13), 20014(foo14)
 > su-2.05b$ id foouser
 > uid=1004(foouser) gid=1004(foouser) groups=1004(foouser), 20001(foo1),
 > 20002(foo2), 20003(foo3), 20004(foo4), 20005(foo5), 20006(foo6),
 > 20007(foo7), 20008(foo8), 20009(foo9), 20010(foo10), 20011(foo11),
 > 20012(foo12), 20013(foo13), 20014(foo14)
 >
 > When I uncomment the line '#foo15:*:20015:foouser' from /etc/group
 > I get the following error
 >
 > bash-2.05b$ su foouser
 > Password:
 > su: setusercontext: Invalid argument
 
 There now seems to be an off-by-2 bug.  I have something like the
 above foo[1-15] in my /etc/group for just one user, for the purpose
 of running old POSIX tests (NIST PCTS) of {NGROUPS_MAX}.  This used
 to work, but I now get the above error from su.  login fails too,
 but doesn't print a message.  The POSIX tests only worked because
 I misconfigured them to hide the longstanding off-by-one error:
 
 % SUPPLEMENTARY_GROUPS    "0,1000,1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1011,1012,1013,1014"
 
 The account is supposed to have primary group 0 (wheel) and 16 supplementary
 groups 1000-1015, but #1015 doesn't work so I left it out of /etc/groups
 and fudged the tests to succeed by putting the primary group in the
 supplementary groups.
 
 The new bug is exposed by:
 
 % RCS file: /home/ncvs/src/lib/libc/gen/initgroups.c,v
 % Working file: initgroups.c
 % head: 1.8
 % ...
 % ----------------------------
 % revision 1.8
 % date: 2003/11/19 15:51:26;  author: dds;  state: Exp;  lines: +7 -2
 % Fix problem where initgroups would silently truncate groups with
 % more than NGROUP elements without providing the opportunity to
 % setgroups to fail and correctly return error and set errno.
 % 
 % MFC after:	2 weeks
 % ----------------------------
 % 
 % Index: initgroups.c
 % ===================================================================
 % RCS file: /home/ncvs/src/lib/libc/gen/initgroups.c,v
 % retrieving revision 1.7
 % retrieving revision 1.8
 % diff -u -2 -r1.7 -r1.8
 % --- initgroups.c	1 Feb 2002 00:57:29 -0000	1.7
 % +++ initgroups.c	19 Nov 2003 15:51:26 -0000	1.8
 % ...
 % @@ -51,7 +51,12 @@
 %  	gid_t agroup;
 %  {
 % -	int groups[NGROUPS], ngroups;
 % +	int ngroups;
 % +	/*
 % +	 * Provide space for one group more than NGROUPS to allow
 % +	 * setgroups to fail and set errno.
 % +	 */
 % +	gid_t groups[NGROUPS + 1];
 
 This has some style bugs.
 
 % 
 % -	ngroups = NGROUPS;
 % +	ngroups = NGROUPS + 1;
 %  	getgrouplist(uname, agroup, groups, &ngroups);
 %  	return (setgroups(ngroups, groups));
 
 getgrouplist() puts `agroup' in groups[0] and groups[1], so there are
 only 14 slots left for the supplementary groups.  Thus when there are
 15 supplementary groups all different from `agroup', the above change
 makes ngroups = 17 and setgroups() fails.  Immediately before the above
 change, the last group was just discarded so it couldn't have worked,
 but think PCTS would have found the bug if the last group had not worked
 long ago.
 
 getgrouplist() has to put `agroup' in groups[0] because the group list
 has to include the primary group in FreeBSD due to a dubious efficiency
 hack in the implementation -- in FreeBSD, the effective group id is
 just groups[0].  This efficiency hacks is what gives the first off-by-1
 error -- {NGROUPS_MAX} iis missing the adjustment for the hack.
 
 getgrouplist() normally puts `agroup' in groups[1] according to the
 following comment:
 
 % 	/*
 % 	 * When installing primary group, duplicate it;
 % 	 * the first element of groups is the effective gid
 % 	 * and will be overwritten when a setgid file is executed.
 % 	 */
 
 I don't understand why this is needed.  It obviously increments the off-by-1
 error count, since the setgroups(2) up immediately in the kernel if the
 groups count exceeds {NGROUPS_MAX}.
 
 getgrouplist() has changed signficantly near this bug since rev.1.1, but
 the change only reduces the bugs:
 
 % Index: getgrouplist.c
 % ===================================================================
 % RCS file: /home/ncvs/src/lib/libc/gen/getgrouplist.c,v
 % retrieving revision 1.1
 % retrieving revision 1.13
 % diff -u -2 -r1.1 -r1.13
 % --- getgrouplist.c	27 May 1994 04:56:40 -0000	1.1
 % +++ getgrouplist.c	16 Feb 2003 17:29:09 -0000	1.13
 % ...
 % @@ -70,17 +70,21 @@
 %  	 */
 %  	setgrent();
 % -	while (grp = getgrent()) {
 % -		if (grp->gr_gid == agroup)
 % -			continue;
 % -		if (ngroups >= maxgroups) {
 % -			ret = -1;
 % -			break;
 % +	while ((grp = getgrent()) != NULL) {
 % +		for (i = 0; i < ngroups; i++) {
 % +			if (grp->gr_gid == groups[i])
 % +				goto skip;
 
 Now getgrouplist() avoids wasting slots for duplicates in all cases
 after the first 1 or 2 slots, where it use to only avoid duplicates of
 `agroup'.  Avoiding duplicates allows {NGROUPS_MAX} groups to actually
 work in some cases, but cases where there are even {NGROUPS_MAX} - 1
 different groups all different from `agroup' apparently never worked.
 
 The kernel doesn't do any duplicate removal but probably should.
 POSIX doesn't specify setgroups(), but it specifies getgroups(2).
 For getgroups(), it is implementation-defined whether the effective
 group is returned, and applications must be prepared for
 {NGROUPS_MAX} + 1 gids being reurned.  FreeBSD does return the
 egid in the array.  FreeBSD's setgroups needs the egid in slot 0
 in the array.
 
 Bruce


More information about the freebsd-bugs mailing list