kqueue giant-locking (&kq_Giant, locking)

Brian Fundakowski Feldman green at FreeBSD.org
Fri Apr 16 19:12:43 PDT 2004

I believe I have come up with a good solution to the kqueue woes in 5.X, and 
I'd like to get some feedback on work that so far is letting me (on 
uniprocessor, at least) run make -j8 buildworld, with USE_KQUEUE in make(1), 
with no ill effect :)  The locking thus far is one global kqueue lock, and I 
firmly believe we should use MUTEX_PROFILING to determine if we should lock 
it down any further at this point.

There are several major differences so far (of course, fixing that
stack-paged-out-kernel-crash-bug is one of them) and several major
things still to be fixed.

1. The recursion has been removed from kqueue.  This means kqueues cannot be
   added to other kqueues for EVFILT_READ -- yes, that ability has been
   around since r1.1 of kern_event.c, but it is utterly pointless and, as a
   look at my previous patch shows, it severely complicates many things.  Of
   course, I'm sure someone will notice and complain, but there isn't any
   documentation that suggests you should kevent() another kqueue().
2. Because of this, KNOTE() can't end up calling another KNOTE() unless
   the consumer does something stupid (call KNOTE() from filter::event()).
3. Kqueue does the locking for you when it comes to the non-object lists.
   All of the filter::attach() and filter::detach() routines need to lock
   their object lists, but they don't touch kqueue or knote other than
   setting their own knote's fields.  Both of those routines are called
   without any locks held on kqueue's part.
4. The filter::event() routines are called with internal kqueue locking
   held.  You can lock anything else you need to, but you may not sleep;
   it is essentially like an interrupt handler.  You must not call into
   KNOTE() with locks held, but you should hold a reference on your object.
   I've fixed what appears to be the most egregious offender, sys_pipe.c.
5. If KNOTE() as an interrupt does not work for you, you may call KNOTE()
   with any locks you like except the ones it uses internally (mainly
   filedesc and file), but the only information you can give your
   filter::event() is the hint argument.
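
As a rough illustration of #2 and #4 (a userland model, not code from the
patch), the sketch below walks a knote list with a single global lock
"held" and refuses a nested call made from inside an event routine.  Every
name in it (model_knote, model_knote_list, model_kq_giant_held, and so on)
is hypothetical, the lock is modeled as a plain flag, and the real KNOTE()
may well handle a nested call differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userland model; none of these names come from the patch. */
struct model_knote;
typedef int (*model_event_fn)(struct model_knote *kn, long hint);

struct model_knote {
    model_event_fn event;      /* stands in for filter::event() */
    int active;                /* set when the filter reports an event */
    struct model_knote *next;  /* next knote on the object's list */
};

/* Stands in for the single global kqueue lock (a flag, not a real mutex). */
static int model_kq_giant_held;

/*
 * Model of KNOTE(): walk the list with the global lock "held".
 * Returns the number of knotes activated, or -1 if it was entered
 * re-entrantly, i.e. a filter called KNOTE() from its event routine.
 */
static int model_knote_list(struct model_knote *head, long hint)
{
    struct model_knote *kn;
    int activated = 0;

    if (model_kq_giant_held)
        return -1;             /* nested call: refused in this model */
    model_kq_giant_held = 1;
    for (kn = head; kn != NULL; kn = kn->next) {
        if (kn->event(kn, hint)) {
            kn->active = 1;
            activated++;
        }
    }
    model_kq_giant_held = 0;
    return activated;
}

/* A well-behaved filter: fires on any nonzero hint, touches nothing else. */
static int model_good_event(struct model_knote *kn, long hint)
{
    (void)kn;
    return hint != 0;
}

/* A misbehaving filter: calls back into KNOTE() from its event routine. */
static int model_bad_nested_result;

static int model_bad_event(struct model_knote *kn, long hint)
{
    model_bad_nested_result = model_knote_list(kn, hint);
    return 0;
}
```

In this model the well-behaved filter never sees the lock flag at all,
while the misbehaving one gets -1 back from its nested call, which is the
"something stupid" case from #2.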

Examples of #4 are bpf and pipe; they do not need to pass any information
in the filter::event() hint, and as every handler that works on the object
instead of on hints needs to do, they verify for certain whether or not
the KNOTE() should have actually fired and ignore false alarms.
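
A minimal sketch of that "check the object, not the hint" pattern, again as
a userland model rather than the actual sys_pipe.c code.  The struct
layouts and the names model_pipe, rd_model_knote and model_filt_read are
assumptions made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a pipe-like object; not the sys_pipe.c layout. */
struct model_pipe {
    size_t bytes_buffered;  /* data currently readable */
    int    eof;             /* writer side has gone away */
};

struct rd_model_knote {
    struct model_pipe *obj; /* object the knote is attached to */
    long data;              /* value reported to the kevent() consumer */
    int  eof_flag;          /* models an end-of-file flag on the event */
};

/*
 * Model of a read filter's event routine.  The hint is ignored; the
 * decision is made only from the object's current state, so a KNOTE()
 * that fired with nothing to read is reported as "not ready" (0).
 */
static int model_filt_read(struct rd_model_knote *kn, long hint)
{
    struct model_pipe *p = kn->obj;

    (void)hint;
    kn->data = (long)p->bytes_buffered;
    if (p->eof) {
        kn->eof_flag = 1;
        return 1;
    }
    return kn->data > 0;
}
```

Because the routine recomputes readiness from the object every time, a
spurious KNOTE() costs one extra call and is otherwise harmless.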

The biggest example of #5 is process events.  There are many different
process-type locks that may be held when KNOTE() is called, but the
implementation of filter::event() is mostly correct in locking nothing.
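
For contrast, a hint-only event routine in the style of #5 might look
roughly like the following.  This is a sketch under stated assumptions: the
MODEL_NOTE_* bits, proc_model_knote and model_filt_proc are invented for
the example and are not the real NOTE_* values or the kernel's process
filter.

```c
#include <assert.h>

/* Hypothetical event bits for this sketch; not the real NOTE_* values. */
#define MODEL_NOTE_EXIT 0x1u
#define MODEL_NOTE_FORK 0x2u
#define MODEL_NOTE_EXEC 0x4u

struct proc_model_knote {
    unsigned wanted;    /* events the consumer registered for */
    unsigned reported;  /* events accumulated for delivery */
};

/*
 * Model of a hint-driven event routine: it takes no locks and reads no
 * process state, so it does not care which locks the KNOTE() caller
 * holds.  Everything it learns comes from the hint argument.
 */
static int model_filt_proc(struct proc_model_knote *kn, long hint)
{
    unsigned event = (unsigned)hint & kn->wanted;

    kn->reported |= event;
    return kn->reported != 0;
}
```

The trade-off is the one described in #5: the routine is lock-free, but
anything the caller wants it to know has to fit in the hint.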

In kern_fork.c, KNOTE() is called outside of the proc lock (p1->p_klist not
locked as it should be) because it has to be special-cased somehow.
This is the most disgusting thing EVAR.
(NB: See http://green.homeunix.org/~green/kqueue-locking.1.patch for that.)

Current patch at: http://green.homeunix.org/~green/kqueue-giant-locking.0.patch

Brian Fundakowski Feldman                           \'[ FreeBSD ]''''''''''\
  <> green at FreeBSD.org                               \  The Power to Serve! \
 Opinions expressed are my own.                       \,,,,,,,,,,,,,,,,,,,,,,\
