panic in sbflush
Robert Watson
rwatson at FreeBSD.org
Sat Jun 6 07:25:12 UTC 2009
On Fri, 5 Jun 2009, Barney Cordoba wrote:
> I'm getting a panic in sbflush where mbcnt is 0 and sb_mb is not empty. Any
> clues as to what might cause this? It's happening during a load test.
sbflush() panics are typically symptoms of bugs elsewhere in the network stack
or kernel, often race conditions. In essence, sbflush() is called when a
socket is closed and packets have to be drained from the receive socket
buffer. During that draining, we sanity check that the cached length of the
data in the socket buffer (sb_cc) matches the actual length of data in the
buffer. If sb_cc, sb_mb, or sb_mbcnt is non-zero at the end of the function,
we panic.
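To make the check concrete, here is a small user-space model of the accounting, not the kernel source. Only the field names sb_cc, sb_mb and sb_mbcnt come from the description above; the model_* types and functions, the m_size field, and the return-code convention are assumptions made for illustration. The drain loop is deliberately simpler than the real one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf; not the kernel's struct mbuf. */
struct model_mbuf {
    struct model_mbuf *m_next;  /* next buffer in the chain */
    unsigned int m_len;         /* bytes of data held right now */
    unsigned int m_size;        /* storage charged to sb_mbcnt (assumed field) */
};

/* Hypothetical stand-in for a sockbuf, keeping the three fields named above. */
struct model_sockbuf {
    struct model_mbuf *sb_mb;   /* head of the queued chain */
    unsigned int sb_cc;         /* cached count of data bytes */
    unsigned int sb_mbcnt;      /* cached count of storage in use */
};

/* Queue one mbuf and charge the cached counters using its current fields. */
static void model_sbappend(struct model_sockbuf *sb, struct model_mbuf *m)
{
    struct model_mbuf **pp;

    assert(sb != NULL && m != NULL);
    for (pp = &sb->sb_mb; *pp != NULL; pp = &(*pp)->m_next)
        ;
    m->m_next = NULL;
    *pp = m;
    sb->sb_cc += m->m_len;
    sb->sb_mbcnt += m->m_size;
}

/*
 * Drain the buffer, then apply the end-of-function check.
 * Returns 0 if the cached counters agree with the chain, and -1 in the
 * situation where the kernel would panic.
 */
static int model_sbflush(struct model_sockbuf *sb)
{
    while (sb->sb_mbcnt != 0 && sb->sb_mb != NULL) {
        struct model_mbuf *m = sb->sb_mb;

        sb->sb_mb = m->m_next;
        /* Uncharge using the mbuf's *current* fields, clamped at zero. */
        sb->sb_cc -= (m->m_len < sb->sb_cc) ? m->m_len : sb->sb_cc;
        sb->sb_mbcnt -= (m->m_size < sb->sb_mbcnt) ? m->m_size : sb->sb_mbcnt;
    }
    if (sb->sb_cc != 0 || sb->sb_mb != NULL || sb->sb_mbcnt != 0)
        return -1;
    return 0;
}
```

In this model, an mbuf whose m_len is changed after model_sbappend() leaves sb_cc non-zero at the end of the drain, and an mbuf whose m_next is pointed at a buffer that was never charged leaves sb_mb non-NULL with sb_mbcnt already at zero, which is the shape of the report above.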
Most of the time, it's a driver race condition where an mbuf has been injected
into the stack using ifp->if_input(), but the driver has then modified the
mbuf after injection (perhaps by setting a length, clearing a pointer, etc).
We had a spate of them after we moved to direct dispatch because the timing
changed, leading to packets being processed before the return of if_input()
rather than "some time later".
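The ordering problem can be sketched in user-space C as follows. This is an illustration under assumptions, not driver code: struct pkt stands in for an mbuf, model_ifnet for the interface, and model_direct_input() for a stack that consumes the packet before if_input() returns. All of those names, and the rx_* functions, are hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-in for an mbuf: only a length field. */
struct pkt {
    unsigned int len;
};

/* Hypothetical stand-in for the interface and the stack behind it. */
struct model_ifnet {
    void (*if_input)(struct model_ifnet *, struct pkt *);
    unsigned int bytes_accounted;   /* what the "stack" charged for input */
};

/*
 * Direct dispatch model: the packet is fully processed, and its length
 * accounted for, before this function returns to the driver.
 */
static void model_direct_input(struct model_ifnet *ifp, struct pkt *p)
{
    ifp->bytes_accounted += p->len;
}

/* Buggy receive path: modifies the packet after handing it to the stack. */
static void rx_buggy(struct model_ifnet *ifp, struct pkt *p, unsigned int hwlen)
{
    ifp->if_input(ifp, p);
    p->len = hwlen;     /* too late: the stack already used the old length */
}

/* Fixed receive path: finish every modification, then hand the packet off. */
static void rx_fixed(struct model_ifnet *ifp, struct pkt *p, unsigned int hwlen)
{
    p->len = hwlen;
    ifp->if_input(ifp, p);  /* the driver must not touch p after this point */
}
```

With a queued (deferred) input path, rx_buggy() would usually get away with it, because the late store lands before the stack looks at the packet. With direct dispatch, the accounted length and the packet's final length disagree.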
Once in a while it's a bug in TCP or socket buffer handling, or in some
intermediate encapsulation/decapsulation layer along similar lines to the
driver race scenario. I think the most recent case I'm aware of was actually
a socket buffer bug, but that's fairly unusual in the history of reports of
this panic.
There is a kernel debugging option to perform run-time sanity checking of the
sockbuf structure so that the corruption is found earlier, called "options
SOCKBUF_DEBUG". My experience is that it's good for finding deterministic
socket buffer corruption bugs, but that it changes the timing significantly so
tends to mask narrow race conditions involving "inject the packet and then
change it".
Hope that helps,
Robert N M Watson
Computer Laboratory
University of Cambridge