misc/185873: waitpid() in linux threads fails with ECHILD

Henry Hu henry.hu.sh at gmail.com
Sun Jan 19 03:10:00 UTC 2014


>Number:         185873
>Category:       misc
>Synopsis:       waitpid() in linux threads fails with ECHILD
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Jan 19 03:10:00 UTC 2014
>Closed-Date:
>Last-Modified:
>Originator:     Henry Hu
>Release:        FreeBSD 11-CURRENT
>Organization:
Columbia University
>Environment:
FreeBSD pepsi 11.0-CURRENT FreeBSD 11.0-CURRENT #1 r260031M: Sun Jan  5 18:25:51 EST 2014     root at pepsi:/usr/obj/usr/src/sys/MYKERNEL  amd64

>Description:
If a Linux program
1. uses fork() to create a child,
2. creates a thread, and
3. calls waitpid() in the new thread to wait for the child,

then waitpid() returns -1 before the child exits, with errno set to ECHILD.

This affects some applications:
1. IntelliJ (experimental port available) with the Oracle JDK
2. Android Studio (based on IntelliJ)
and possibly other programs that use the Oracle JDK.

When java.lang.ProcessBuilder is used to create a child and read its output, the Oracle JDK works as follows:
1. it uses fork() to create the child
2. it creates a thread to wait for the child
3. the thread calls waitpid() to wait for the child to exit
4. the thread reads the child's output

Because waitpid() incorrectly returns -1 here before the child exits, the child may not yet have produced any output, so the caller receives empty output and may wrongly conclude that the child is not working.
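
The pattern described above can be sketched in C. This is not the JDK's code, only an illustration of the same sequence (fork, then a thread that waits and afterwards reads the output). The names struct capture, reaper() and run_and_capture() are hypothetical, and the non-blocking read is an assumption made here so that a premature waitpid() return shows up as empty output instead of the read blocking until the child finishes.

```c
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one captured child; not taken from the JDK. */
struct capture {
	pid_t pid;        /* child to wait for */
	int fd;           /* read end of the child's stdout pipe */
	pid_t wait_ret;   /* what waitpid() returned */
	int wait_errno;   /* errno if waitpid() failed, else 0 */
	char out[256];    /* captured output, NUL-terminated */
	size_t len;       /* number of bytes captured */
};

/* Thread body: wait for the child first, then drain whatever it wrote. */
static void *reaper(void *arg)
{
	struct capture *c = arg;
	int status = 0;
	ssize_t n;

	c->wait_ret = waitpid(c->pid, &status, 0);
	c->wait_errno = (c->wait_ret == -1) ? errno : 0;

	/* Assumption: the pipe is non-blocking, so if waitpid() came back
	 * early and the child has written nothing yet, read() fails with
	 * EAGAIN and the captured output stays empty. */
	c->len = 0;
	while (c->len < sizeof(c->out) - 1 &&
	    (n = read(c->fd, c->out + c->len,
	    sizeof(c->out) - 1 - c->len)) > 0)
		c->len += (size_t)n;
	c->out[c->len] = '\0';
	return NULL;
}

/* Fork argv[0] with stdout on a pipe and let a second thread reap it.
 * Returns 0 once the thread has finished, -1 if setup failed. */
static int run_and_capture(char *const argv[], struct capture *c)
{
	int pfd[2];
	pthread_t thr;

	if (pipe(pfd) == -1)
		return -1;
	c->pid = fork();
	if (c->pid == -1) {
		close(pfd[0]);
		close(pfd[1]);
		return -1;
	}
	if (c->pid == 0) {
		/* Child: send stdout to the pipe and run the command. */
		close(pfd[0]);
		dup2(pfd[1], STDOUT_FILENO);
		close(pfd[1]);
		execvp(argv[0], argv);
		_exit(127);
	}
	close(pfd[1]);
	fcntl(pfd[0], F_SETFL, O_NONBLOCK);
	c->fd = pfd[0];
	if (pthread_create(&thr, NULL, reaper, c) != 0) {
		close(pfd[0]);
		waitpid(c->pid, NULL, 0);
		return -1;
	}
	pthread_join(thr, NULL);
	close(pfd[0]);
	return 0;
}
```

Where waitpid() behaves correctly, calling run_and_capture() with a short command such as "git --version" should leave wait_ret equal to the child's PID and the command's output in out. With the behaviour reported here, wait_ret would be -1, wait_errno would be ECHILD, and out would most likely be empty.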

IntelliJ, for example, runs "git --version" to obtain the git version. Because the output is empty, it assumes that git is not working and disables some features.
>How-To-Repeat:
A simple test program:

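#include <stdio.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* PID of the forked child, shared with the worker thread. */
pid_t child;

/* Runs in a second thread and waits for the child forked by main(). */
void* worker(void* arg) {
	int status = 0;
	(void)arg;
	printf("worker waiting\n");
	int ret = (int)waitpid(child, &status, 0);
	printf("waitpid ret: %d status: %d\n", ret, status);
	return NULL;
}

int main() {
	child = fork();
	if (child == 0) {
		/* Child: stay alive for a while so the parent has to wait. */
		printf("child running\n");
		sleep(3);
		printf("child exit\n");
	} else {
		printf("forked: %d\n", (int)child);
		pthread_t thr;
		pthread_create(&thr, NULL, worker, NULL);
		/* Outlive the child so the worker has time to report. */
		sleep(5);
	}
	return 0;
}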

When it is built and run natively on a FreeBSD or Linux machine, it outputs:

forked: 98484
child running
worker waiting
child exit
waitpid ret: 98484 status: 0

However, when a Linux build of the program is run on a FreeBSD machine, it outputs:

forked: 95940
child running
worker waiting
waitpid ret: -1 status: 0
child exit
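
The test program prints only the return value, not errno. A small helper along the following lines could be called from worker() to confirm that the failure really is ECHILD. This is only a sketch: describe_wait() is a hypothetical name, and retrying on EINTR is an addition made here, not part of the original test.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: wait for pid and describe the outcome in buf.
 * Returns 0 if the child was reaped, otherwise the errno from waitpid(). */
static int describe_wait(pid_t pid, char *buf, size_t buflen)
{
	int status = 0;
	int saved;
	pid_t ret;

	/* Retry if a signal interrupts the wait. */
	do {
		ret = waitpid(pid, &status, 0);
	} while (ret == -1 && errno == EINTR);

	if (ret == -1) {
		saved = errno;
		snprintf(buf, buflen, "waitpid failed: %s", strerror(saved));
		return saved;
	}
	if (WIFEXITED(status))
		snprintf(buf, buflen, "pid %d exited with status %d",
		    (int)ret, WEXITSTATUS(status));
	else if (WIFSIGNALED(status))
		snprintf(buf, buflen, "pid %d killed by signal %d",
		    (int)ret, WTERMSIG(status));
	else
		snprintf(buf, buflen, "pid %d: raw status 0x%x",
		    (int)ret, (unsigned)status);
	return 0;
}
```

In worker(), the waitpid() call and the printf() after it could be replaced by a call to describe_wait(child, buf, sizeof(buf)) followed by printing buf. With the behaviour reported here, the helper would be expected to return ECHILD while the child is still running.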
>Fix:


>Release-Note:
>Audit-Trail:
>Unformatted:

