Re: Kqueues and fork

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Thu, 21 Aug 2025 19:15:44 UTC

On Thu, Aug 21, 2025 at 02:48:28PM -0400, Mark Johnston wrote:
> On Wed, Aug 20, 2025 at 02:11:44PM +0300, Konstantin Belousov wrote:
> > Right now, kqueue fds are marked as not D_PASSABLE, which means that
> > the corresponding file descriptor is not copied into the child filedesc
> > table on fork. The reasoning is that kqueues work on file descriptors,
> > and not even files, so they are tied to the fdesc table.
> > 
> > As a curious coincidence, I had two private discussions last week,
> > where in both cases people were interested in getting more useful
> > behavior on fork from kqueues. [My understanding is that epoll does
> > that, so there is a desire to make kqueue equal in the capability.]
> > 
> > I convinced myself that kqueues can indeed be copied on fork.
> > Precisely, the proposed semantics are the following:
> > - fdesc copy would allocate a new kqueue at the same fd as the existing
> >   kqueue in the source fdesc
> > - each registered event in the source kqueue is copied as the same event
> >   (for the same filter, of course) into the new kqueue
> > - if the event is active at the time of copy, its copy is activated
> >   as well
> > 
> > The prototype in https://reviews.freebsd.org/D52045 gives the naive
> > implementation of the idea.  What I mean by 'naive' is explained in the
> > review summary, where I point out the places requiring (much) more work.
> > 
> > The new copy behavior is requested by the KQUEUE_CPONFORK flag to
> > kqueue1(2).  Existing code that does not specify the flag gets the old
> > (drop) action on fork.
> > 
> > An example of the usage is provided by https://reviews.freebsd.org/P665.
> > 
> > Before I spend a lot of effort on finishing this, I want to discuss
> > the proposal:
> > 
> > Is this what the app writers want?
> 
> Looking at your patch, it seems that the child will receive a completely
> separate kqueue, i.e., the queue itself is process-private.  From my
> reading of epoll docs, after fork the child will share the epoll state
> with the parent in some sense.

I do not see how we could share anything because we copy filedesc.

> 
> I wonder if it is really useful for the child process to inherit non-fd
> knotes?  Maybe such knotes should be ignored.

IMO the inheritance of e.g. timer events is the right thing to do.
I do not see why the child would not want the signal events, or in fact
most of the non-isfd events.  They are all functionally meaningful
after the fork.

I understand that in specific circumstances the child might not want some
kinds of events, but it is then up to the child code to EV_DELETE them, or
to use the hypothetical EV_NOCPONFORK flag proposed by Kyle.

If there is some preference to not copy non-isfd events, I can add
two flags to kqueue1() instead of one.  E.g. KQUEUE_CPONFORKFD and
KQUEUE_CPONFORKNONFD, and then
#define KQUEUE_CPONFORK (KQUEUE_CPONFORKFD | KQUEUE_CPONFORKNONFD)
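
To illustrate how such a split could drive the per-knote decision at fork
time, here is a minimal userspace sketch.  The bit values and the helper
name kq_copy_knote_on_fork() are hypothetical and are not taken from D52045;
the 'isfd' argument stands in for the filter's isfd property.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values, for illustration only; not from D52045. */
#define KQUEUE_CPONFORKFD	0x0001	/* copy knotes whose ident is an fd */
#define KQUEUE_CPONFORKNONFD	0x0002	/* copy timer, signal, etc. knotes */
#define KQUEUE_CPONFORK		(KQUEUE_CPONFORKFD | KQUEUE_CPONFORKNONFD)

/*
 * Hypothetical helper: decide whether a single knote would be copied
 * into the child's kqueue, given the flags the kqueue was created with.
 */
static bool
kq_copy_knote_on_fork(unsigned int kq_flags, bool isfd)
{
	unsigned int need;

	need = isfd ? KQUEUE_CPONFORKFD : KQUEUE_CPONFORKNONFD;
	return ((kq_flags & need) != 0);
}

/* Small usage example covering the four flag combinations. */
static void
example_checks(void)
{
	/* Combined flag: both fd and non-fd knotes are copied. */
	assert(kq_copy_knote_on_fork(KQUEUE_CPONFORK, true));
	assert(kq_copy_knote_on_fork(KQUEUE_CPONFORK, false));

	/* Only fd knotes requested: timers and signals are dropped. */
	assert(kq_copy_knote_on_fork(KQUEUE_CPONFORKFD, true));
	assert(!kq_copy_knote_on_fork(KQUEUE_CPONFORKFD, false));

	/* Only non-fd knotes requested. */
	assert(!kq_copy_knote_on_fork(KQUEUE_CPONFORKNONFD, true));
	assert(kq_copy_knote_on_fork(KQUEUE_CPONFORKNONFD, false));

	/* No flag: nothing is copied. */
	assert(!kq_copy_knote_on_fork(0, true));
	assert(!kq_copy_knote_on_fork(0, false));
}
```

With this shape, code that passes KQUEUE_CPONFORK sees no difference from
the single-flag proposal, while an application that only cares about fd
events can ask for KQUEUE_CPONFORKFD alone.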