New "timeout" api, to replace callout
Robert Watson
rwatson at FreeBSD.org
Fri Dec 28 02:30:15 PST 2007
On Fri, 28 Dec 2007, Hans Petter Selasky wrote:
>> The reason you need to do a drain is to allow for safe destroying of the
>> lock. Specifically, drivers tend to do this:
>>
>> FOO_LOCK(sc);
>> ...
>> callout_stop(...);
>> FOO_UNLOCK(sc);
>> ...
>> callout_drain(...);
>> ...
>> mtx_destroy(&sc->foo_mtx);
>>
>> If you don't have the drain and softclock is trying to acquire the backing
>> mutex while you have it held (before the callout_stop) then Bad Things can
>> happen if you don't do the drain. Having the lock just "give up" doesn't
>> work either because if the memory containing the lock is free'd and
>> reinitialized such that it looks enough like a valid lock then softclock
>> (or its equivalent) will still try to obtain it. Also, you need to do a
>> drain so it is safe to free the callout structure to prevent it from being
>> recycled and having weird races where it gets recycled and rescheduled but
>> the timer code thinks it has a pending stop for that pointer and so it
>> aborts the wrong instance of the timer, etc.
>
> I completely agree to what John Baldwin is writing. You need two
> stop-functions:
>
> xxx_stop which is non-blocking and xxx_drain which can block i.e. sleep
>
> BTW: The USB code in P4 uses the same semantics, due to the same reasons:
>
> usbd_transfer_stop() and usbd_transfer_drain()
>
> The only difference is that I pass an error code to the callback which might
> happen after that usbd_transfer_stop is called.
>
> I think that xxx_stop() and xxx_drain() is a generic approach that should be
> applied to all callback systems. Whenever you have a callback you need to be
> able to stop it and drain it.
I think the argument that Poul-Henning is making is not that you don't need
something that behaves "like drain", but rather, we're like the wait for drain
to be a short, mutex-length wait rather than a long, msleep-length wait.
Remember that the bodies of callouts are expected to run in a very short
period of time in order to not stall the timer system, in fact, in such a way
that a mutex could be held over the entirely timeout call. Given that this is
the case, one might reasonably expect callout_stop() to perform the drain
rather than having a separate call. Such a model would be very advantageous
in TCP, where rather than having to defer GC'ing the inpcb/tcpcb to a GC
worker thread, which we don't do but should, we could use the stop call safely
and eliminate a whole class of races from the stack.
Robert N M Watson
Computer Laboratory
University of Cambridge
More information about the freebsd-arch
mailing list