panic in sbflush

Robert Watson rwatson at FreeBSD.org
Sat Jun 6 07:25:12 UTC 2009


On Fri, 5 Jun 2009, Barney Cordoba wrote:

> I'm getting a panic in sbflush where mbcnt is 0 and sb_mb is not empty. Any 
> clues as to what might cause this? It's happening during a load test.

sbflush() panics are typically symptoms of bugs elsewhere in the network stack 
or kernel, often race conditions.  In essence, sbflush() is called when a 
socket is closed and packets have to be drained from the receive socket 
buffer.  During that draining, we sanity check that the cached length of the 
data in the socket buffer (sb_cc) matches the actual length of data in the 
buffer.  If sb_cc, sb_mb, or sb_mbcnt is non-zero at the end of the function, 
we panic.
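
To make that check concrete, here is a minimal userland sketch of the 
accounting involved.  It is a model, not the kernel's code: the model_* 
structures and functions are hypothetical stand-ins, and only the field names 
sb_cc, sb_mb, sb_mbcnt, and m_len are borrowed from the real structures.  The 
flush routine returns -1 in the situation where the kernel would panic.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for an mbuf. */
struct model_mbuf {
	struct model_mbuf *m_next;
	unsigned int m_len;	/* bytes of data held by this mbuf */
	unsigned int m_size;	/* storage charged against sb_mbcnt */
};

/* Hypothetical, simplified stand-in for a socket buffer. */
struct model_sockbuf {
	struct model_mbuf *sb_mb;	/* head of the mbuf chain */
	struct model_mbuf *sb_tail;	/* tail of the chain (model only) */
	unsigned int sb_cc;		/* cached count of data bytes */
	unsigned int sb_mbcnt;		/* cached count of storage bytes */
};

/* Append an mbuf and charge it to the cached counters. */
static void
model_sbappend(struct model_sockbuf *sb, struct model_mbuf *m)
{
	m->m_next = NULL;
	if (sb->sb_mb == NULL)
		sb->sb_mb = m;
	else
		sb->sb_tail->m_next = m;
	sb->sb_tail = m;
	sb->sb_cc += m->m_len;
	sb->sb_mbcnt += m->m_size;
}

/*
 * Drain the buffer, crediting each mbuf back using its *current* fields.
 * Returns 0 if the counters and the chain agree at the end, or -1 if any
 * of sb_cc, sb_mb, or sb_mbcnt is left non-zero (the panic condition).
 */
static int
model_sbflush(struct model_sockbuf *sb)
{
	while (sb->sb_mb != NULL &&
	    sb->sb_cc >= sb->sb_mb->m_len &&
	    sb->sb_mbcnt >= sb->sb_mb->m_size) {
		struct model_mbuf *m = sb->sb_mb;

		sb->sb_cc -= m->m_len;
		sb->sb_mbcnt -= m->m_size;
		sb->sb_mb = m->m_next;
	}
	if (sb->sb_mb == NULL)
		sb->sb_tail = NULL;
	return ((sb->sb_cc != 0 || sb->sb_mb != NULL || sb->sb_mbcnt != 0) ?
	    -1 : 0);
}
```

In this model, appending two mbufs and flushing returns 0.  If m_len on one of 
them is changed after the append, the counters no longer match the chain and 
the flush returns -1, which is the shape of failure reported above.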

Most of the time, it's a driver race condition where an mbuf has been injected 
into the stack using ifp->if_input(), but the driver has then modified the 
mbuf after injection (perhaps by setting a length, clearing a pointer, etc). 
We had a spate of them after we moved to direct dispatch because the timing 
changed, leading to packets being processed before the return of if_input() 
rather than "some time later".
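
The ordering problem can be sketched in userland as follows.  Everything here 
is hypothetical (model_pkt, model_ifnet, rx_buggy, rx_correct); it only 
assumes that, under direct dispatch, the input routine processes the packet 
before it returns, as described above.  It is an illustration of the pattern, 
not code from any real driver.

```c
#include <stddef.h>

/* Hypothetical stand-in for a received packet. */
struct model_pkt {
	unsigned int len;	/* length the stack will account for */
};

/* Hypothetical stand-in for an interface with an input hook. */
struct model_ifnet {
	void (*if_input)(struct model_ifnet *, struct model_pkt *);
	unsigned int bytes_accounted;	/* what the "stack" recorded */
};

/*
 * Models direct dispatch: the packet is fully processed, and its length
 * accounted for, before this function returns to the driver.
 */
static void
model_direct_input(struct model_ifnet *ifp, struct model_pkt *p)
{
	ifp->bytes_accounted += p->len;
}

/* Buggy pattern: the driver touches the packet after handing it off. */
static void
rx_buggy(struct model_ifnet *ifp, struct model_pkt *p, unsigned int len)
{
	p->len = 0;
	ifp->if_input(ifp, p);
	p->len = len;		/* too late: the stack already used len == 0 */
}

/* Correct pattern: finish all setup, hand off, then leave it alone. */
static void
rx_correct(struct model_ifnet *ifp, struct model_pkt *p, unsigned int len)
{
	p->len = len;
	ifp->if_input(ifp, p);
	/* The packet belongs to the stack now; do not modify it here. */
}
```

With a deferred input path the buggy version would often get away with it, 
because the late store lands before anyone reads the length.  With the 
synchronous model above, the stack's accounting and the packet disagree 
immediately.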

Once in a while it's a bug in TCP or socket buffer handling, or in some 
intermediate encapsulation/decapsulation layer along similar lines to the 
driver race scenario.  I think the most recent case I'm aware of was actually 
a socket buffer bug, but that's fairly unusual in the history of reports of 
this panic.

There is a kernel debugging option to perform run-time sanity checking of the 
sockbuf structure so that the corruption is found earlier, called "options 
SOCKBUF_DEBUG".  My experience is that it's good for finding deterministic 
socket buffer corruption bugs, but that it changes the timing significantly, so 
it tends to mask narrow race conditions involving "inject the packet and then 
change it".

Hope that helps,

Robert N M Watson
Computer Laboratory
University of Cambridge

