[Differential] [Commented On] D1777: Associated fix for arp/nd6 timer usage.

rrs (Randall Stewart) phabric-noreply at FreeBSD.org
Wed Feb 4 22:26:58 UTC 2015


rrs added a comment.

I don't think this is a refcnt issue, bz. The root of this is a hole in the way
the callout code works. Basically there is a window when:

a) The callout wheel is executing. It sees that a "lock" has been configured, so it goes to
     release the callout wheel lock and then take the lock the callout was init'd with.

b) At that time some other cpu holds that lock (the one that was init'd on the callout), and it then
    runs a callout_stop (not drain). This "stops" the callout from running
    (which it can do): it sets a flag on the callout and returns to the caller. The caller (lle in this case)
    proceeds to drop the ref cnt, since the callout was stopped (and it is, it won't be run). It then
    in the end frees the memory.

c) Now we resume a) above, and it now de-refs the lock, which lives in the memory that was just freed.
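
To make the ordering concrete, here is a rough single-threaded userspace model of
steps a) through c). This is only a sketch and not the kernel code: the types, the
field names and replay_stop_race() are all made up for illustration, and a "freed"
flag stands in for the real free so the model itself never touches bad memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's real types. */
struct model_lock {
	bool held;	/* some cpu currently owns the lock */
	bool freed;	/* the memory holding the lock was released */
};

struct model_entry {		/* plays the role of the lle */
	struct model_lock lock;	/* the lock handed to the callout at init */
	int refcnt;		/* one ref for the owner, one for the callout */
	bool cancelled;		/* flag set by the modelled callout_stop */
};

enum outcome { HANDLER_SKIPPED, TOUCHED_FREED_LOCK };

/* Replays steps a), b) and c) in the order described above. */
enum outcome
replay_stop_race(void)
{
	struct model_entry e = { .refcnt = 2 };
	struct model_lock *saved;

	/* Precondition for b): another cpu already owns the entry lock. */
	e.lock.held = true;

	/*
	 * a) The callout thread remembers which lock to take, drops the
	 *    wheel lock and is about to block on the remembered lock.
	 */
	saved = &e.lock;

	/* b) The lock owner stops (does not drain) the callout. */
	e.cancelled = true;	/* the stop "succeeds" */
	assert(e.refcnt == 2);
	e.refcnt--;		/* drop the callout's reference */
	e.lock.held = false;
	if (--e.refcnt == 0)	/* drop the owner's reference */
		e.lock.freed = true;	/* stands in for freeing the lle */

	/* c) The callout thread resumes and goes for the remembered lock. */
	if (saved->freed)
		return (TOUCHED_FREED_LOCK);
	return (HANDLER_SKIPPED);
}
```

In this model replay_stop_race() ends in TOUCHED_FREED_LOCK: the cancel flag is
only looked at after the lock is taken, and by then the lock is already gone.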

This window is not avoidable with the way the current callout code is architected. It can only
be avoided by the caller taking the lock itself, rather than having the callout system take it.
That way the callout system never de-refs the lock and can't blow up when it hits deleted memory.
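
A similar userspace sketch of the avoidance, under some assumptions of mine: the
callout is set up with no lock of its own, the handler takes the entry lock itself,
and a stop that arrives after the handler has been dispatched reports failure so the
caller leaves the callout's reference alone. Again every name here (model_stop,
model_handler, replay_with_caller_locking) is hypothetical and this is not the
actual patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an lle-like entry; not a kernel structure. */
struct model_entry {
	bool lock_held;
	bool freed;		/* stands in for the memory being released */
	bool handler_running;	/* callout thread has committed to the handler */
	int refcnt;		/* one ref for the owner, one for the callout */
};

static void
model_unref(struct model_entry *e)
{
	assert(e->refcnt > 0);
	if (--e->refcnt == 0)
		e->freed = true;
}

/* Assumed behaviour: the stop only succeeds before the handler is dispatched. */
static bool
model_stop(const struct model_entry *e)
{
	return (!e->handler_running);
}

/* The handler takes the lock itself and drops the callout's reference. */
static bool
model_handler(struct model_entry *e)
{
	if (e->freed)
		return (false);	/* would be the crash */
	e->lock_held = true;
	/* ... timer work would go here ... */
	e->lock_held = false;
	model_unref(e);
	return (true);
}

bool
replay_with_caller_locking(void)
{
	struct model_entry e = { .refcnt = 2 };
	bool handler_ok;

	/* a) The callout thread dispatches; it has no lock to take first. */
	e.handler_running = true;

	/* b) Another cpu holds the entry lock and tries to stop the callout. */
	e.lock_held = true;
	if (model_stop(&e))
		model_unref(&e);	/* only when the stop really worked */
	model_unref(&e);		/* the owner's own reference */
	e.lock_held = false;

	/* c) The handler runs against memory that is still valid. */
	handler_ok = model_handler(&e);
	return (handler_ok && e.freed && e.refcnt == 0);
}
```

The point of the design is that the last reference is dropped by whoever actually
ran last, so in this model the entry is only released after the handler is done
with the lock.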

There may be other ways to fix this, but I don't know how we can change the callout
system to handle it. Even Hans's re-write has this same problem if you use callout_stop and
not callout_drain*

REVISION DETAIL
  https://reviews.freebsd.org/D1777

To: rrs, jhb, imp, sbruno, gnn, rwatson, lstewart, kostikbel, adrian, bz
Cc: bz, emaste, hiren, julian, hselasky, freebsd-net

