kern/180385: [kqueue] Conflict between EVFILT_PROC NOTE_CHILD and NOTE_EXIT use of data field

David A. Bright David_A_Bright at DELL.com
Mon Jul 8 14:50:00 UTC 2013


>Number:         180385
>Category:       kern
>Synopsis:       [kqueue] Conflict between EVFILT_PROC NOTE_CHILD and NOTE_EXIT use of data field
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Jul 08 14:50:00 UTC 2013
>Closed-Date:
>Last-Modified:
>Originator:     David A. Bright
>Release:        9.1-RELEASE-p4
>Organization:
Dell | Compellent
>Environment:
FreeBSD localhost.local 9.1-RELEASE-p4 FreeBSD 9.1-RELEASE-p4 #0: Mon Jun 17 11:42:37 UTC 2013     root at amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64

>Description:
There is a bug in the kevent EVFILT_PROC handling, possibly introduced by a change made in 2006:

http://svnweb.freebsd.org/base/head/sys/kern/kern_event.c?r1=164450&r2=164451

The scenario is a process that spawns a bunch of other processes and uses the kevent EVFILT_PROC, NOTE_TRACK facility to keep tabs on them, maintaining a process history including process parent/child relationships and whether the processes are running or have exited. Consider a process that does a fork() and then the child attempts to exec() a non-existent program and does an exit(127) on the exec() failure. This can result in a single kevent that returns both the NOTE_CHILD and NOTE_EXIT fflags.

Unfortunately, both of these fflags are defined to return something in kevent.data: the parent pid (ppid) for NOTE_CHILD and the exit status for NOTE_EXIT. Obviously, kevent.data can't contain both pieces of information. In fact, what gets returned is the exit status, so when the receiving code tries to interpret the NOTE_CHILD it appears that the ppid is 32512 (127 << 8), which really throws off the process tracking.
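To show where the 32512 comes from, here is a minimal sketch that uses only fork() and waitpid(), no kqueue. It assumes that the value NOTE_EXIT places in kevent.data uses the same wait(2)-style encoding, which is what the 32512 observed above suggests. The helper name raw_status_for_exit_code() is made up for this illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that immediately calls _exit(code)
 * and return the raw wait(2)-style status reported for it, or -1 on error.
 */
int
raw_status_for_exit_code(int code)
{
	int status = 0;
	pid_t pid = fork();

	if (pid < 0)
		return (-1);
	if (pid == 0)
		_exit(code);		/* child: exit right away */

	/* parent: collect the child, retrying if interrupted by a signal */
	while (waitpid(pid, &status, 0) < 0) {
		if (errno != EINTR)
			return (-1);
	}
	return (status);
}
```

For a child that does exit(127), the raw status is 32512, and WEXITSTATUS() on that value gives back 127. A tracker that reads the same number as a ppid would record a parent pid of 32512.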

Before the change mentioned above, the exit status was not returned, so the returned ppid for the NOTE_CHILD would have been correct (sys/kern/kern_event.c, filt_procattach(), about line 368 in head), although the returned ppid would not make much sense as an exit status for the NOTE_EXIT.

What I think should happen is that when filt_proc() (about kern_event.c line 433) is going to set the exit status, it should first check if NOTE_CHILD is already set in the knote and, if so, allocate a new knote for the NOTE_EXIT and queue it after the existing NOTE_CHILD knote (some adjustment would need to be made around line 424, too, since that is where the NOTE_EXIT was set). This would guarantee that the NOTE_CHILD was received before the NOTE_EXIT and that the appropriate piece of data could accompany each NOTE_ flag.

I don't know if there is a significant concern about allocating a new knote at that point. If that were the case and the ideal behavior I described were not possible, I would think that it would be more appropriate to return the ppid in the knote and not set the kn->kn_data field to the exit status iff the NOTE_CHILD fflag was set, thereby giving it precedence. That would at least work much better for my particular situation! If you receive a kevent with both NOTE_CHILD and NOTE_EXIT set, you might be able to presume that the child probably failed; in any case you would know for sure that it had exited and what process was its parent.
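Under that fallback, the receiving code could interpret an event roughly as in the sketch below. This is only an illustration of the proposed precedence rule, not of what 9.1-RELEASE-p4 does (there, kevent.data holds the exit status when both fflags are set). The struct proc_note type and the function interpret_proc_event() are hypothetical names. The stand-in #defines are an assumption added so the sketch compiles on systems without <sys/event.h>; their values are copied from FreeBSD's header.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <stdbool.h>
#include <stdint.h>

#if defined(__FreeBSD__)
#include <sys/event.h>
#endif
#ifndef NOTE_CHILD
/* Stand-ins with the values from FreeBSD's <sys/event.h>. */
#define NOTE_EXIT	0x80000000
#define NOTE_CHILD	0x00000004
#endif

/* Hypothetical record a process tracker might fill in per event. */
struct proc_note {
	bool	is_child;	/* NOTE_CHILD seen: ppid is valid */
	bool	exited;		/* NOTE_EXIT seen */
	bool	status_known;	/* exit_code is valid */
	long	ppid;
	int	exit_code;
};

/*
 * Interpret fflags/data from an EVFILT_PROC event, assuming the proposed
 * rule that NOTE_CHILD takes precedence over NOTE_EXIT for the data field.
 */
struct proc_note
interpret_proc_event(unsigned int fflags, intptr_t data)
{
	struct proc_note n = { false, false, false, 0, 0 };

	if (fflags & NOTE_CHILD) {
		n.is_child = true;
		n.ppid = (long)data;		/* data carries the ppid */
		if (fflags & NOTE_EXIT)
			n.exited = true;	/* exited, status not available */
	} else if (fflags & NOTE_EXIT) {
		int status = (int)data;		/* wait(2)-style status */

		n.exited = true;
		n.status_known = true;
		n.exit_code = WEXITSTATUS(status);
	}
	return (n);
}
```

With this rule a combined NOTE_CHILD | NOTE_EXIT event still yields the correct parent and marks the process as exited; only the exit status is lost.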

I've exchanged email with jhb on the problem and he indicated that it might be a while before he could get to it. I'll take a shot at it myself, but it will probably be at least a couple of weeks before I can do so. I wanted to file this PR so that it was out there in case someone else might be able to get to it sooner and also so that it isn't forgotten.
>How-To-Repeat:
This is timing-related, but doing a fork() and then exiting immediately in the child is likely to show the problem fairly often.
>Fix:
jhb suggested:

"Hmmm, this might be fixable by adding a f_touch method to the EVFILT_PROC
handling and having it notice the two states and break them up."


>Release-Note:
>Audit-Trail:
>Unformatted:

