Special schedulers, one CPU only kernel, one only userland

Wed Aug 17 17:27:53 GMT 2005

On Wednesday 17 August 2005 12:33 pm, Luigi Rizzo wrote:
> On Wed, Aug 17, 2005 at 08:36:04AM -0400, John Baldwin wrote:
> > On Tuesday 16 August 2005 10:13 pm, gnn at freebsd.org wrote:
> > > At Tue, 16 Aug 2005 16:22:41 -0700,
> > >
> > > luigi wrote:
> > > > ok just for the records, the issue i had in mind is the
> > > > release/acquire the mutex around some code that might cause a
> > > > deadlock, not for the mutex ops per se, but for the need to make sure
> > > > that the data structure is consistent before releasing the lock, and
> > > > rechecking
>
> ...
>
> > > Correct me if I'm wrong but I suspect you're thinking of cases such
> > > as:
> > >
> > > 		RT_UNLOCK(rt);		/* XXX workaround LOR */
> > > 		gwrt = rtalloc1(gate, 1, 0);
> > > 		RT_LOCK(rt);
> > >
> > > in the routing/networking code.  A quick, and by no means exhaustive,
> > > check of CURRENT with cscope and Emacs looking for these turned up 3.
>
> actually most network device drivers have a similar thing in
> the receive interrupt processing. The  code below is from if_fxp.c
> but many other drivers have that as well:
>
>                         /*
>                          * Drop locks before calling if_input() since it
>                          * may re-enter fxp_start() in the netisr case.
>                          * This would result in a lock reversal.  Better
>                          * performance might be obtained by chaining all
>                          * packets received, dropping the lock, and then
>                          * calling if_input() on each one.
>                          */
>                         FXP_UNLOCK(sc);
>                         (*ifp->if_input)(ifp, m);
>                         FXP_LOCK(sc);
> 			...
>
> here (and everywhere else)  there is no check that the device is still
> in a consistent state when reacquiring the lock.
>
> How safe is this in the face of, say, device removal, i have no idea,
> because i cannot find a place that comments what the sc->sc_mtx lock is
> supposed to protect.
>
> But in fact, reading fxp_detach() is not very encouraging....
> and the leftover spl*() calls are not making it any easier to understand
> the code.
>
> Actually, if someone can point me to something that documents how
> network locking (device driver and above) is designed, i'd be
> very grateful.

fxp(4)'s locking is somewhat buggy where you are looking probably.  I think 
I've already committed the fixes to HEAD so that detach() is less 
discouraging (we just lock fxp_stop() in detach now).  The calls to 
if_input() are the one place in ethernet drivers where they drop the lock and 
then reacquire it later after they have finished dequeueing a freshly 
received packet from their rx ring.  One suggestion for improvement here is 
to have the various ethernet drivers batch up all the received packets in a 
IFQ private to that function and then at the end of the function drop the 
lock and pass the packets up to if_input so that in the case that multiple 
packets are received, there are fewer lock operations.

-- 
John Baldwin <jhb at FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org