git: d0675399d09f - main - capsicum: allow subset of wait4(2) functionality

From: Mariusz Zaborski <oshogbo_at_FreeBSD.org>
Date: Tue, 27 Aug 2024 15:25:06 UTC
The branch main has been updated by oshogbo:

URL: https://cgit.FreeBSD.org/src/commit/?id=d0675399d09f02d347912e23d004329710338450

commit d0675399d09f02d347912e23d004329710338450
Author:     Edward Tomasz Napierala <trasz@FreeBSD.org>
AuthorDate: 2024-08-27 15:19:24 +0000
Commit:     Mariusz Zaborski <oshogbo@FreeBSD.org>
CommitDate: 2024-08-27 15:22:12 +0000

    capsicum: allow subset of wait4(2) functionality
    
    The usual way of handling process exit in capsicum(4) mode is
    by using process descriptors (pdfork(2)) instead of the traditional
    fork(2)/wait4(2) API. But most apps haven't been converted this way,
    and many cannot because the wait is hidden behind library APIs that
    revolve around PID numbers and not descriptors; GLib's
    g_spawn_check_wait_status(3) is one example.
    
    Thus, provide backwards compatibility by allowing the wait(2) family
    of functions in Capsicum mode, except for child processes created by
    pdfork(2).
    
    Reviewed by:    brooks, oshogbo
    Sponsored by:   Innovate UK
    Differential Revision:  https://reviews.freebsd.org/D44372
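
Editor's illustration (not part of the commit): a minimal sketch of the
pattern the message describes, where a child created with plain fork(2) is
reaped by PID with waitpid(2) after the parent has entered capability mode.
The helper name spawn_and_reap() is hypothetical. The cap_enter(2) call is
compiled only on FreeBSD; elsewhere the code is an ordinary fork/waitpid
pair. On kernels without this change the waitpid() call would be expected
to fail in capability mode rather than return the child's status.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>

#ifdef __FreeBSD__
#include <sys/capsicum.h>
#endif

/*
 * Hypothetical helper: fork a child that exits with `code`, enter
 * capability mode (FreeBSD only), then reap the child by PID.
 * Returns the child's exit status, or -1 on error with errno set.
 */
static int
spawn_and_reap(int code)
{
	pid_t pid, r;
	int status;

	pid = fork();		/* plain fork(2), not pdfork(2) */
	if (pid == -1)
		return (-1);
	if (pid == 0)
		_exit(code);

#ifdef __FreeBSD__
	/* Tolerate kernels built without Capsicum support. */
	if (cap_enter() == -1 && errno != ENOSYS)
		return (-1);
#endif

	/* Retry if a signal interrupts the wait. */
	do {
		r = waitpid(pid, &status, 0);
	} while (r == -1 && errno == EINTR);
	if (r == -1)
		return (-1);

	return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}
```

Calling spawn_and_reap(7) should return 7 when the wait succeeds. Judging by
the kern_exit.c hunk below, a child created with pdfork(2) would instead be
skipped by the wait(2) family while the caller is in capability mode, so such
children still have to be handled through their process descriptors.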
---
 contrib/capsicum-test/capmode.cc |  1 -
 lib/libsys/wait.2                | 10 +++++++---
 sys/kern/kern_exit.c             | 12 ++++++++++++
 sys/kern/syscalls.master         |  2 +-
 4 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/contrib/capsicum-test/capmode.cc b/contrib/capsicum-test/capmode.cc
index f32d9e038744..5ff025290211 100644
--- a/contrib/capsicum-test/capmode.cc
+++ b/contrib/capsicum-test/capmode.cc
@@ -594,7 +594,6 @@ FORK_TEST_F(WithFiles, AllowedMiscSyscalls) {
     AWAIT_INT_MESSAGE(pipefds[0], MSG_CHILD_STARTED);
     errno = 0;
     EXPECT_CAPMODE(ptrace_(PTRACE_PEEKDATA_, pid, &pid, NULL));
-    EXPECT_CAPMODE(waitpid(pid, NULL, WNOHANG));
     SEND_INT_MESSAGE(pipefds[0], MSG_PARENT_REQUEST_CHILD_EXIT);
     if (verbose) fprintf(stderr, "  child finished\n");
   }
diff --git a/lib/libsys/wait.2 b/lib/libsys/wait.2
index 8b504e070b7a..3c649f3dfa77 100644
--- a/lib/libsys/wait.2
+++ b/lib/libsys/wait.2
@@ -25,7 +25,7 @@
 .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 .\" SUCH DAMAGE.
 .\"
-.Dd June 24, 2022
+.Dd August 27, 2024
 .Dt WAIT 2
 .Os
 .Sh NAME
@@ -605,9 +605,13 @@ must be checked against zero to determine if a process reported status.
 .Pp
 The
 .Fn wait
-family of functions will not return a child process created with
+family of functions will only return a child process created with
 .Xr pdfork 2
-unless specifically directed to do so by specifying its process ID.
+if the calling process is not in
+.Xr capsicum 4
+capability mode, and
+.Nm
+has been explicitly given the child's process ID.
 .Sh ERRORS
 The
 .Fn wait
diff --git a/sys/kern/kern_exit.c b/sys/kern/kern_exit.c
index f83f0433f9cd..f6263cd46d06 100644
--- a/sys/kern/kern_exit.c
+++ b/sys/kern/kern_exit.c
@@ -1330,6 +1330,18 @@ loop_locked:
 			return (0);
 		}
 
+		/*
+		 * When running in capsicum(4) mode, make wait(2) ignore
+		 * processes created with pdfork(2).  This is because one can
+		 * disown them - by passing their process descriptor to another
+		 * process - which means it needs to be prevented from touching
+		 * them afterwards.
+		 */
+		if (IN_CAPABILITY_MODE(td) && p->p_procdesc != NULL) {
+			PROC_UNLOCK(p);
+			continue;
+		}
+
 		nfound++;
 		PROC_LOCK_ASSERT(p, MA_OWNED);
 
diff --git a/sys/kern/syscalls.master b/sys/kern/syscalls.master
index 9fdd443955c7..fac1c2e1e96f 100644
--- a/sys/kern/syscalls.master
+++ b/sys/kern/syscalls.master
@@ -157,7 +157,7 @@
 		    int fd
 		);
 	}
-7	AUE_WAIT4	STD {
+7	AUE_WAIT4	STD|CAPENABLED {
 		int wait4(
 		    int pid,
 		    _Out_opt_ int *status,