kern/108390: wait4() erroneously waits for all children when SIGCHLD is SIG_IGN

Alan Ferrency alan at pair.com
Fri Jan 26 23:00:36 UTC 2007


>Number:         108390
>Category:       kern
>Synopsis:       wait4() erroneously waits for all children when SIGCHLD is SIG_IGN
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Jan 26 23:00:35 GMT 2007
>Closed-Date:
>Last-Modified:
>Originator:     Alan Ferrency
>Release:        6.2-RELEASE, 6.1-STABLE, 5.5-PRERELEASE
>Organization:
pair Networks, Inc.
>Environment:
FreeBSD <snipped>.pair.com 6.2-RELEASE FreeBSD 6.2-RELEASE #0: Mon Jan 15 22:21:03 EST 2007     mlehner at mayon.pair.com:/usr/obj/usr/src/sys/6PAIRc  i386
FreeBSD <snipped>.pair.com 6.1-STABLE FreeBSD 6.1-STABLE #3: Tue May 16 12:04:45 EDT 2006     cap at pit54.pair.com:/usr/src/sys/i386/compile/PAIR_WS  i386
FreeBSD <snipped>.pair.net 5.5-PRERELEASE FreeBSD 5.5-PRERELEASE #0: Mon Mar  6 14:49:09 EST 2006     root at gw06.pair.net:/usr/obj/usr/src/sys/FIVEGWb  i386
>Description:
If sigaction() is used to set the action for SIGCHLD to SIG_IGN, and wait4() is then used to wait for a specific child process to exit, wait4() does not return until all child processes have exited.  That is incorrect when the SA_NOCLDWAIT flag is not passed to sigaction(), yet it is what happens on the systems described above even when SA_NOCLDWAIT is not specified.

The correct behavior in this case is for wait4() to return as soon as the specified child process exits.  When SIGCHLD is not set to SIG_IGN, wait4() behaves this way.

The sample program below works correctly in all cases on FreeBSD 4.8-RELEASE, but fails on the versions of FreeBSD specified above.

Thanks,

Alan Ferrency
>How-To-Repeat:
Included here is a test program, along with sample output from a failing run.  The program demonstrates the correct behavior without SIG_IGN, and the failing behavior with it.

Sample output:

default SIGCHLD

1169851899 55091 P  spawned running child
1169851899 55091 P  spawned short running child; waitpid
1169851899 55092 C1 Long child sleeping
1169851899 55093 C2 short child: exiting immediately
1169851899 55091 P  short child finished
1169851899 55091 P  waiting for long running child
1169851909 55092 C1 Long exiting
1169851909 55091 P  long child finished. End of test.


IGNORE SIGCHLD

1169851909 55091 P  spawned running child
1169851909 55091 P  spawned short running child; waitpid
1169851909 55097 C1 Long child sleeping
1169851909 55098 C2 short child: exiting immediately
1169851919 55097 C1 Long exiting
1169851919 55091 P  short child finished
1169851919 55091 P  waiting for long running child
1169851919 55091 P  long child finished. End of test.


The first column of output is the epoch timestamp in seconds; the second column is the PID of the process that wrote the line.

In the first case, the first wait4() call on the short-lived child process returns immediately, and then there is a 10 second delay when the second wait4() call waits for the long-running process to complete.

In the second case, the first wait4() call on the short-lived child PID waits 10 seconds for the long-lived child to finish, before it returns.  Then, there is no delay at the second wait4() call.  This is incorrect.

The test program was compiled with gcc, with no additional libraries or compile-time options:

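#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>   /* declares wait4() */
#include <signal.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>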

#define SLEEP 10

void test_it (char *);

int main (void) {

  struct sigaction act;

  /* With the default SIGCHLD, everything works as expected. */
  test_it("default SIGCHLD");

  /* When SIGCHLD is set to SIG_IGN (without SA_NOCLDWAIT), wait4() waits
     for all children to exit, not just the one PID we asked for */
  act.sa_handler = SIG_IGN;
  sigemptyset(&act.sa_mask);
  act.sa_flags = 0;

  sigaction(SIGCHLD, &act, NULL);
  test_it("IGNORE SIGCHLD");

  exit(0);
}

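void Log (char *msg) {
  /* time_t and pid_t are not guaranteed to be int; cast to match the format */
  printf("%10ld %5d %s\n", (long)time(NULL), (int)getpid(), msg);
  fflush(stdout);
}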

void test_it (char *msg) {
  pid_t long_pid, short_pid;
  int stat;

  printf ("\n\n%s\n\n", msg);
  fflush(stdout);  /* avoid duplicating buffered output in the children */

  /* Fork a long running child */
  if ((long_pid = fork()) != 0) {
    Log("P  spawned running child");

    /* Fork a short running child */
    if ((short_pid = fork()) != 0) {
      Log("P  spawned short running child; waitpid"); 
      wait4(short_pid, &stat, 0, 0);
      Log("P  short child finished");
    } else { /* the short running child */
      Log("C2 short child: exiting immediately");
      exit(0);
    }
    Log("P  waiting for long running child");
    wait4(long_pid, &stat, 0, 0);
    Log("P  long child finished. End of test.");
  } else { /* the long running child*/
    Log("C1 Long child sleeping");
    sleep(SLEEP);
    Log("C1 Long exiting");
    exit(0);
  }
}
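
The program above discards the return value of wait4(), so its log cannot distinguish a call that reaped the requested child from a call that returned an error.  When reproducing the problem it may help to record both.  The following is a minimal sketch, assuming a hypothetical helper named checked_wait4() that is not part of the original test program:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * checked_wait4() is a hypothetical helper, not part of the original test.
 * It waits for one specific child, retries if a signal interrupts the call,
 * and logs either the reaped PID and exit status or the error from wait4().
 * status must not be NULL.
 * Returns what wait4() returned: the child's PID, or -1 with errno set.
 */
pid_t checked_wait4(pid_t pid, int *status) {
  pid_t ret;

  do {
    ret = wait4(pid, status, 0, NULL);
  } while (ret == -1 && errno == EINTR);

  if (ret == -1) {
    int saved = errno;  /* printf() may change errno */
    printf("wait4(%d) failed: %s\n", (int)pid, strerror(saved));
    fflush(stdout);
    errno = saved;
  } else if (WIFEXITED(*status)) {
    printf("wait4(%d) reaped child, exit status %d\n",
           (int)ret, WEXITSTATUS(*status));
    fflush(stdout);
  } else {
    printf("wait4(%d) reaped child, raw status 0x%x\n",
           (int)ret, (unsigned)*status);
    fflush(stdout);
  }
  return ret;
}
```

In test_it(), the two wait4() calls could be replaced with checked_wait4(short_pid, &stat) and checked_wait4(long_pid, &stat) to see what each call actually returns in the "IGNORE SIGCHLD" case.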

>Fix:
Unknown
>Release-Note:
>Audit-Trail:
>Unformatted:

