Large Capsicum patch for review.

Mon Feb 25 10:35:56 UTC 2013

On 25.02.2013 00:59, Pawel Jakub Dawidek wrote:
> On Sun, Feb 24, 2013 at 04:07:12PM +0100, Christoph Mallon wrote:
>> On 24.02.2013 07:05, Christoph Mallon wrote:
>> - Why did you choose INT_MAX to represent "all capabilities"?
>>   The data type used is ssize_t, so SSIZE_MAX seems more logical.
>>   Further, there are several places where size_t and ssize_t come into contact, which need careful tests and casts to get right.
>>   Maybe it is better to always use size_t and represent "all capabilities" with SIZE_T_MAX or (size_t)-1 (which is guaranteed by C to be the maximal value).
> 
> This is not used for capabilities, but for white-listing ioctl commands.
> INT_MAX seems to just be least odd. We only allow for 256 anyway.
> I could provide some dedicated define for it, eg.
> 
> #define	CAP_IOCTLS_ALL	<some random big number>

A nice name is a good step.
Still, I would prefer, if consistently only one type (size_t) would be used instead of two/three (ssize_t, size_t and int for the constant).

>> - Is it correct, that fdp seems to be not locked in fp_getfvp()?
>>   Otherweise, fget_locked() could be used instead of the manual check.
> 
> Yeah, I don't like this too, but AFAIR it doesn't have to be locked, as
> it is used by nfsd processes only that don't manipulate filedesc table.
> This is at least what I recall I was told.

This deserves a comment in the code.

>> - I could not verify many changes, e.g. changed requested permissions, because this is just a big diff with no information about the intention of the changes.
>>   A series of smaller diffs with commit logs to state their intent would be really useful for reviewing.
> 
> The entire history is in perforce, but there are many commits in there.

So effectively the history is lost.

> The key goals of the patch are:
> 
> - Move from special capability descriptors that were used to keep
>   capability rights in the file structure to ability to configure
>   capability rights on any descriptor type.
> 
> - Make capability rights more practical.
> 
> - Allow for ioctl(2) in capability mode by providing a way to limit
>   permitted ioctl commands.
> 
> - Allow for limiting permitted fcntl(2) commands.

In a branch in svn one could reread these steps on every commit.
Or even better, use git(-svn), which can automatically create a nice patch series (like the one I provided).

>> --- a/lib/libc/gen/cap_sandboxed.c
>> +++ b/lib/libc/gen/cap_sandboxed.c
>> @@ -39,7 +39,7 @@ __FBSDID("$FreeBSD$");
>>  bool
>>  cap_sandboxed(void)
>>  {
>> -	int mode;
>> +	u_int mode;
>>  
>>  	if (cap_getmode(&mode) == -1) {
>>  		assert(errno == ENOSYS);
> 
> Applied, although I prefer 'unsigned int' in userland.

cap_getmode() is declared with u_int, so I used it.

>> --- a/lib/libc/gen/cap_sandboxed.c
>> +++ b/lib/libc/gen/cap_sandboxed.c
>> @@ -41,10 +41,10 @@ cap_sandboxed(void)
>>  {
>>  	u_int mode;
>>  
>> -	if (cap_getmode(&mode) == -1) {
>> +	if (cap_getmode(&mode) != 0) {
>>  		assert(errno == ENOSYS);
>>  		return (false);
>>  	}
>>  	assert(mode == 0 || mode == 1);
>> -	return (mode == 1);
>> +	return (mode != 0);
>>  }
> 
> I prefer comparison with -1, so I skipped this one.

It is always suspicious, if a magic number besides 0 is used.

>> From 1d1defdf7dd117afa56330c958daeac5d52be05f Mon Sep 17 00:00:00 2001
>> From: Christoph Mallon <christoph.mallon at gmx.de>
>> Date: Sun, 24 Feb 2013 13:04:41 +0100
>> Subject: [PATCH 06/24] Avoid polluting the global namespace with stdbool.h.
>>
>> ---
>>  sys/sys/capability.h | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/sys/sys/capability.h b/sys/sys/capability.h
>> index df95ae4..9eed4d1 100644
>> --- a/sys/sys/capability.h
>> +++ b/sys/sys/capability.h
>> @@ -251,7 +251,6 @@ int	cap_fcntl_check(struct filedesc *fdp, int fd, int cmd);
>>  #else /* !_KERNEL */
>>  
>>  __BEGIN_DECLS
>> -#include <stdbool.h>
>>  
>>  /*
>>   * cap_enter(): Cause the process to enter capability mode, which will
>> @@ -270,7 +269,7 @@ int	cap_enter(void);
>>   * Are we sandboxed (in capability mode)?
>>   * This is libc wrapper around cap_getmode(2) system call.
>>   */
>> -bool	cap_sandboxed(void);
>> +_Bool	cap_sandboxed(void);
>>  
>>  /*
>>   * cap_getmode(): Are we in capability mode?
> 
> Not sure yet about this one.

The manpage says, you need to include stdbool.h yourself.

>> From d4c8e6bbeb678867b2bd1c0ec09faf60642465d0 Mon Sep 17 00:00:00 2001
>> From: Christoph Mallon <christoph.mallon at gmx.de>
>> Date: Sun, 24 Feb 2013 13:43:28 +0100
>> Subject: [PATCH 14/24] Avoide double test.
>>
>> ---
>>  sys/kern/sys_capability.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/sys/kern/sys_capability.c b/sys/kern/sys_capability.c
>> index ecb3a6b..dbbda01 100644
>> --- a/sys/kern/sys_capability.c
>> +++ b/sys/kern/sys_capability.c
>> @@ -314,12 +314,12 @@ cap_ioctl_limit_check(struct filedesc *fdp, int fd, const u_long *cmds,
>>  
>>  	ocmds = fdp->fd_ofiles[fd].fde_ioctls;
>>  	for (i = 0; i < ncmds; i++) {
>> -		for (j = 0; j < oncmds; j++) {
>> +		for (j = 0;; j++) {
>> +			if (j == oncmds)
>> +				return (ENOTCAPABLE);
>>  			if (cmds[i] == ocmds[j])
>>  				break;
>>  		}
>> -		if (j == oncmds)
>> -			return (ENOTCAPABLE);
>>  	}
>>  
>>  	return (0);
> 
> I decided to skip this one. My version is more readable, IMHO, as it is
> used for other cases like TAILQ_FOREACH(), etc. where the check is
> already done by the macro.

The duplicate tests do not even match!
You have to carefully read the code to verify, that if indeed only triggers, when the for loop terminated without break.
Code duplication is almost always bad, especially if the duplicated code does not fully match.

>> From 64e66277c265b3a1221d9e269c8f33345e43332c Mon Sep 17 00:00:00 2001
>> From: Christoph Mallon <christoph.mallon at gmx.de>
>> Date: Sun, 24 Feb 2013 13:49:37 +0100
>> Subject: [PATCH 17/24] Avoid code duplication on error paths.
>>
>> ---
>>  sys/kern/sys_capability.c | 42 ++++++++++++++++++------------------------
>>  1 file changed, 18 insertions(+), 24 deletions(-)
>>
>> diff --git a/sys/kern/sys_capability.c b/sys/kern/sys_capability.c
>> index 153364b..e87a4e5 100644
>> --- a/sys/kern/sys_capability.c
>> +++ b/sys/kern/sys_capability.c
>> @@ -344,37 +344,32 @@ sys_cap_ioctls_limit(struct thread *td, struct cap_ioctls_limit_args *uap)
>>  	} else {
>>  		cmds = malloc(sizeof(cmds[0]) * ncmds, M_TEMP, M_WAITOK);
>>  		error = copyin(uap->cmds, cmds, sizeof(cmds[0]) * ncmds);
>> -		if (error != 0) {
>> -			free(cmds, M_TEMP);
>> -			return (error);
>> -		}
>> +		if (error != 0)
>> +			goto out;
>>  	}
>>  
>>  	fdp = td->td_proc->p_fd;
>>  	FILEDESC_XLOCK(fdp);
>>  
>>  	if (fget_locked(fdp, fd) == NULL) {
>> -		FILEDESC_XUNLOCK(fdp);
>> -		free(cmds, M_TEMP);
>> -		return (EBADF);
>> +		error = EBADF;
>> +		goto out_locked;
>>  	}
>>  
>>  	error = cap_ioctl_limit_check(fdp, fd, cmds, ncmds);
> 
> Applied, but I removed first goto and now out label is placed before
> unlock.

This leaves code duplication in place, which is error prone.

>> From 42aaaa14c5c1a5d96503a88a381909d6c3254727 Mon Sep 17 00:00:00 2001
>> From: Christoph Mallon <christoph.mallon at gmx.de>
>> Date: Sun, 24 Feb 2013 14:04:36 +0100
>> Subject: [PATCH 20/24] Avoid comparing signed with unsigned values.
>>
>> ---
>>  sys/fs/nfsserver/nfs_nfsdport.c | 2 +-
>>  sys/kern/sys_capability.c       | 7 ++++---
>>  2 files changed, 5 insertions(+), 4 deletions(-)
>>
>> diff --git a/sys/fs/nfsserver/nfs_nfsdport.c b/sys/fs/nfsserver/nfs_nfsdport.c
>> index 880f965..69f5629 100644
>> --- a/sys/fs/nfsserver/nfs_nfsdport.c
>> +++ b/sys/fs/nfsserver/nfs_nfsdport.c
>> @@ -2775,7 +2775,7 @@ fp_getfvp(struct thread *p, int fd, struct file **fpp, struct vnode **vpp)
>>  	int error = 0;
>>  
>>  	fdp = p->td_proc->p_fd;
>> -	if ((u_int)fd >= fdp->fd_nfiles ||
>> +	if (fd < 0 || fdp->fd_nfiles <= fd ||
>>  	    (fp = fdp->fd_ofiles[fd].fde_file) == NULL) {
>>  		error = EBADF;
>>  		goto out;
>> diff --git a/sys/kern/sys_capability.c b/sys/kern/sys_capability.c
>> index e87a4e5..e3622b0 100644
>> --- a/sys/kern/sys_capability.c
>> +++ b/sys/kern/sys_capability.c
>> @@ -274,7 +274,7 @@ cap_ioctl_check(struct filedesc *fdp, int fd, u_long cmd)
>>  {
>>  	u_long *cmds;
>>  	ssize_t ncmds;
>> -	u_int i;
>> +	ssize_t i;
>>  
>>  	FILEDESC_LOCK_ASSERT(fdp);
>>  	KASSERT(fd >= 0 && fd < fdp->fd_nfiles,
>> @@ -302,12 +302,13 @@ cap_ioctl_limit_check(struct filedesc *fdp, int fd, const u_long *cmds,
>>  {
>>  	u_long *ocmds;
>>  	ssize_t oncmds;
>> -	u_int i, j;
>> +	size_t i;
>> +	ssize_t j;
>>  
>>  	oncmds = fdp->fd_ofiles[fd].fde_nioctls;
>>  	if (oncmds == -1)
>>  		return (0);
>> -	if (oncmds < ncmds)
>> +	if ((size_t)oncmds < ncmds)
>>  		return (ENOTCAPABLE);
>>  
>>  	ocmds = fdp->fd_ofiles[fd].fde_ioctls;
> 
> Applied with some minor tweaks.

Why did you mismatch the types of i and ncmds as well as j and oncmds?

>> From a05144e19411d2b3fc8716e72ef2fad7cc9449ae Mon Sep 17 00:00:00 2001
>> From: Christoph Mallon <christoph.mallon at gmx.de>
>> Date: Sun, 24 Feb 2013 15:09:16 +0100
>> Subject: [PATCH 21/24] For readability simplify if (x) y = a; else y = b; with
>>  long y to y = x ? a : b.
>>
>> ---
>>  sys/kern/kern_descrip.c   | 10 +++-------
>>  sys/kern/sys_capability.c |  6 ++----
>>  2 files changed, 5 insertions(+), 11 deletions(-)
>>
>> diff --git a/sys/kern/kern_descrip.c b/sys/kern/kern_descrip.c
>> index 612d51a..30dac04 100644
>> --- a/sys/kern/kern_descrip.c
>> +++ b/sys/kern/kern_descrip.c
>> @@ -842,13 +842,9 @@ do_dup(struct thread *td, int flags, int old, int new,
>>  	fdp->fd_ofiles[new] = fdp->fd_ofiles[old];
>>  	filecaps_copy(&fdp->fd_ofiles[old].fde_caps,
>>  	    &fdp->fd_ofiles[new].fde_caps);
>> -	if ((flags & DUP_CLOEXEC) != 0) {
>> -		fdp->fd_ofiles[new].fde_flags =
>> -		    fdp->fd_ofiles[old].fde_flags | UF_EXCLOSE;
>> -	} else {
>> -		fdp->fd_ofiles[new].fde_flags =
>> -		    fdp->fd_ofiles[old].fde_flags & ~UF_EXCLOSE;
>> -	}
>> +	fdp->fd_ofiles[new].fde_flags = flags & DUP_CLOEXEC ?
>> +	    fdp->fd_ofiles[old].fde_flags |  UF_EXCLOSE :
>> +	    fdp->fd_ofiles[old].fde_flags & ~UF_EXCLOSE;
>>  	if (new > fdp->fd_lastfile)
>>  		fdp->fd_lastfile = new;
>>  	*retval = new;
>> diff --git a/sys/kern/sys_capability.c b/sys/kern/sys_capability.c
>> index e3622b0..2306811 100644
>> --- a/sys/kern/sys_capability.c
>> +++ b/sys/kern/sys_capability.c
>> @@ -409,10 +409,8 @@ sys_cap_ioctls_get(struct thread *td, struct cap_ioctls_get_args *uap)
>>  		if (error != 0)
>>  			goto out;
>>  	}
>> -	if (fdep->fde_nioctls == -1)
>> -		td->td_retval[0] = INT_MAX;
>> -	else
>> -		td->td_retval[0] = fdep->fde_nioctls;
>> +	td->td_retval[0] =
>> +	    fdep->fde_nioctls != -1 ? fdep->fde_nioctls : INT_MAX;
>>  	error = 0;
>>  
>>  out:
> 
> Well, IMHO my version is more readable, especially in the first case.

I had to read these lines very carefully to be sure that they actually assign to the same variables.
Code duplication is really hurts readability.
Maybe &fdp->fd_ofiles[old] and &fdp->fd_ofiles[new] should be placed in local variables.

>> From 1258951ef3cdb8b8624e6a7032036e0e5e0ac8c6 Mon Sep 17 00:00:00 2001
>> From: Christoph Mallon <christoph.mallon at gmx.de>
>> Date: Sun, 24 Feb 2013 15:23:15 +0100
>> Subject: [PATCH 22/24] Unify and simplify bitset inclusion tests.
>>
>> - (have | need) != have looks like a typo: shouldn't the | be a &?.
> 
> Which line exactly?

I did /not/ say, it is a typo.
I said, it /looks like/ a typo due to the | instead of the expected & in a bit test.
Therefore I replaced the test by more conventional ones, which also feature less code duplication.

>> - some places used (have | need) != have, others (have & need) == need.
>> - (need & ~have) != 0 -- need and not have -- is easier to comprehend.
>> - Avoid duplication, especially when the duplicated subexpression is long.
>> ---
>>  sys/kern/kern_descrip.c           |  4 ++--
>>  sys/kern/sys_capability.c         | 11 +++++------
>>  usr.bin/procstat/procstat_files.c |  4 ++--
>>  3 files changed, 9 insertions(+), 10 deletions(-)
>>
>> diff --git a/sys/kern/kern_descrip.c b/sys/kern/kern_descrip.c
>> index 30dac04..64176ca 100644
>> --- a/sys/kern/kern_descrip.c
>> +++ b/sys/kern/kern_descrip.c
>> @@ -1422,9 +1422,9 @@ static void
>>  filecaps_validate(const struct filecaps *fcaps, const char *func)
>>  {
>>  
>> -	KASSERT((fcaps->fc_rights | CAP_MASK_VALID) == CAP_MASK_VALID,
>> +	KASSERT((fcaps->fc_rights & ~CAP_MASK_VALID) == 0,
>>  	    ("%s: invalid rights", func));
>> -	KASSERT((fcaps->fc_fcntls | CAP_FCNTL_ALL) == CAP_FCNTL_ALL,
>> +	KASSERT((fcaps->fc_fcntls & ~CAP_FCNTL_ALL) == 0,
>>  	    ("%s: invalid fcntls", func));
>>  	KASSERT(fcaps->fc_fcntls == 0 || (fcaps->fc_rights & CAP_FCNTL) != 0,
>>  	    ("%s: fcntls without CAP_FCNTL", func));
>> diff --git a/sys/kern/sys_capability.c b/sys/kern/sys_capability.c
>> index 2306811..d06918c 100644
>> --- a/sys/kern/sys_capability.c
>> +++ b/sys/kern/sys_capability.c
>> @@ -148,7 +148,7 @@ static inline int
>>  _cap_check(cap_rights_t have, cap_rights_t need, enum ktr_cap_fail_type type)
>>  {
>>  
>> -	if ((have | need) != have) {
>> +	if ((need & ~have) != 0) {
>>  #ifdef KTRACE
>>  		if (KTRPOINT(curthread, KTR_CAPFAIL))
>>  			ktrcapfail(type, need, have);

>> From 4d721b7aa909091fc705cc8f822a13da69e1954c Mon Sep 17 00:00:00 2001
>> From: Christoph Mallon <christoph.mallon at gmx.de>
>> Date: Sun, 24 Feb 2013 15:33:48 +0100
>> Subject: [PATCH 23/24] Simplify assertion condition, which contains a
>>  duplicated subexpression.
>>
>> ---
>>  sys/kern/kern_descrip.c | 5 ++---
>>  1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/sys/kern/kern_descrip.c b/sys/kern/kern_descrip.c
>> index 64176ca..c2556e2 100644
>> --- a/sys/kern/kern_descrip.c
>> +++ b/sys/kern/kern_descrip.c
>> @@ -1428,9 +1428,8 @@ filecaps_validate(const struct filecaps *fcaps, const char *func)
>>  	    ("%s: invalid fcntls", func));
>>  	KASSERT(fcaps->fc_fcntls == 0 || (fcaps->fc_rights & CAP_FCNTL) != 0,
>>  	    ("%s: fcntls without CAP_FCNTL", func));
>> -	KASSERT(((fcaps->fc_nioctls == -1 || fcaps->fc_nioctls == 0) &&
>> -	         fcaps->fc_ioctls == NULL) ||
>> -	        (fcaps->fc_nioctls > 0 && fcaps->fc_ioctls != NULL),
>> +	KASSERT(fcaps->fc_ioctls != NULL ? fcaps->fc_nioctls > 0 :
>> +	    (fcaps->fc_nioctls == -1 || fcaps->fc_nioctls == 0),
>>  	    ("%s: invalid ioctls", func));
>>  	KASSERT(fcaps->fc_nioctls == 0 || (fcaps->fc_rights & CAP_IOCTL) != 0,
>>  	    ("%s: ioctls without CAP_IOCTL", func));
> 
> I think my version is more readable. Skipped.

You have to read this long and split up expression very carefully, to realize, that the cases are mutually exclusive.
?: makes this immediately clear: Check two different cases, depending on whether it is a null pointer or not.

	Christoph