kern/141439: linux_exit_group kills group leader

Stefan Schmidt stefan.schmidt at stadtbuch.de
Sun Dec 13 23:20:02 UTC 2009


>Number:         141439
>Category:       kern
>Synopsis:       linux_exit_group kills group leader
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Dec 13 23:20:01 UTC 2009
>Closed-Date:
>Last-Modified:
>Originator:     Stefan Schmidt
>Release:        FreeBSD 8.0-STABLE as of 2009-12-13
>Organization:
>Environment:
FreeBSD shuttle.stadtbuch.de 8.0-STABLE FreeBSD 8.0-STABLE #0: Sun Dec 13 21:08:07 CET 2009     root at shuttle.stadtbuch.de:/usr/obj/usr/src/sys/SHUTTLE  amd64

>Description:
Using the 32-bit Linux version of Sun's Java Development Kit 1.6 (Update 17) on FreeBSD 8.0 (amd64), invocations of "javac" (or "java") end by printing "Killed" and returning exit code 137 (128 + 9, i.e. termination by SIGKILL), even when the command itself ran to completion.

This is particularly annoying when running, for example, JUnit tests in a separate process. The calling process always receives exit code 137 from its sub-process and assumes that the tests failed.
>How-To-Repeat:
shuttle# /usr/local/linux-sun-jdk1.6.0_17/bin/javac -version
javac 1.6.0_17
Killed
shuttle# echo $?
137
shuttle#
>Fix:
It seems that linux_exit_group (in linux_misc.c) unconditionally sends SIGKILL to the leader of the thread group. This results in exit code 137 and the output of "Killed" as shown above.

I tried to modify linux_exit_group so that it does not kill the group leader and sets the exit code appropriately. While this modified linux_exit_group works fine for me, I doubt it is correct (e.g. regarding locking). I also had a closer look at NetBSD's version of linux_exit_group, but could not get NetBSD's implementation approach to work (that is, setting p_xstat in the linuxulator's process exit hook). I have not tested whether that approach actually works as expected on NetBSD.

Please find my patch attached.

Patch attached with submission follows:

Index: sys/compat/linux/linux_emul.c
===================================================================
RCS file: /home/ncvs/src/sys/compat/linux/linux_emul.c,v
retrieving revision 1.23.2.1
diff -u -w -d -r1.23.2.1 linux_emul.c
--- sys/compat/linux/linux_emul.c	3 Aug 2009 08:13:06 -0000	1.23.2.1
+++ sys/compat/linux/linux_emul.c	13 Dec 2009 21:22:40 -0000
@@ -96,6 +96,7 @@
 			s = malloc(sizeof *s, M_LINUX, M_WAITOK | M_ZERO);
 			s->refs = 1;
 			s->group_pid = child;
+			s->xstat = 0;
 
 			LIST_INIT(&s->threads);
 			em->shared = s;
Index: sys/compat/linux/linux_emul.h
===================================================================
RCS file: /home/ncvs/src/sys/compat/linux/linux_emul.h,v
retrieving revision 1.10.2.1
diff -u -w -d -r1.10.2.1 linux_emul.h
--- sys/compat/linux/linux_emul.h	3 Aug 2009 08:13:06 -0000	1.10.2.1
+++ sys/compat/linux/linux_emul.h	13 Dec 2009 21:22:18 -0000
@@ -36,6 +36,8 @@
 	pid_t	group_pid;
 
 	LIST_HEAD(, linux_emuldata) threads; /* head of list of linux threads */
+	
+	int xstat;
 };
 
 /*
Index: sys/compat/linux/linux_misc.c
===================================================================
RCS file: /home/ncvs/src/sys/compat/linux/linux_misc.c,v
retrieving revision 1.240.2.2
diff -u -w -d -r1.240.2.2 linux_misc.c
--- sys/compat/linux/linux_misc.c	16 Sep 2009 13:24:37 -0000	1.240.2.2
+++ sys/compat/linux/linux_misc.c	13 Dec 2009 22:48:36 -0000
@@ -1708,10 +1708,19 @@
 
 		KASSERT(td_em != NULL, ("exit_group: emuldata not found.\n"));
 
+		if (td_em->shared->refs == 1) {
+		    exit1(td, td_em->shared->xstat ? td_em->shared->xstat : W_EXITCODE(args->error_code, 0));
+		    return (0);
+		}
+		
 		EMUL_SHARED_RLOCK(&emul_shared_lock);
+
+		td_em->shared->xstat = W_EXITCODE(args->error_code, 0);
+
 		LIST_FOREACH_SAFE(em, &td_em->shared->threads, threads, tmp_em) {
-			if (em->pid == td_em->pid)
+			if ((em->pid == td_em->pid) || (em->pid == td_em->shared->group_pid))  {
 				continue;
+                        }
 
 			sp = pfind(em->pid);
 			psignal(sp, SIGKILL);
@@ -1721,8 +1730,18 @@
 #endif
 		}
 
+		if (td->td_proc->p_pid != td_em->shared->group_pid) {
+#ifdef DEBUG
+                    printf(LMSG("linux_sys_exit_group: kill PID %d\n"), td->td_proc->p_pid);
+#endif
+		    psignal(td->td_proc, SIGKILL);
+                }
+                
 		EMUL_SHARED_RUNLOCK(&emul_shared_lock);
+		
+		return (0);
 	}
+
 	/*
 	 * XXX: we should send a signal to the parent if
 	 * SIGNAL_EXIT_GROUP is set. We ignore that (temporarily?)


>Release-Note:
>Audit-Trail:
>Unformatted:

