TCP socket shutdown race condition

Thu Aug 14 07:41:41 PDT 2003

On 13-Aug-2003 Mike Silbersack wrote:
> 
> On Wed, 13 Aug 2003, Ed Maste wrote:
> 
>> I think I've found the problem.
>>
>> crfree() is called from a lot of places (I counted at least 20), including
>> sodealloc() in the socket code, crcopy(), etc.  sodealloc() calls it at
>> splnet().  I'm not sure what spl (if any) the other callers run at, but
>> it is certainly not splnet().
>>
>> I'm going to investigate the correct solution for this and supply a
>> PR / patch, but for now let me know if more information is desired.
>>
>> -ed
> 
> Hm, sounds like you've done some solid debugging, and this should be easy
> to fix.  However, perhaps we need to think about this for a little bit
> longer before we just switch to atomic operations or a spl call within the
> cr functions...
> 
> As I understand it, 4.x uses just a single lock on anything going into the
> kernel, meaning that this type of problem should be prevented.  However,
> maybe there's something a lot more subtle which actually goes on.  What
> I'm thinking is that perhaps we're seeing a single entrypoint which
> happens to call the cr* functions that should be more generally locked,
> and that we're just seeing the problem in the cr functions.
> 
> John, can you give us a quick overview of how 4.x SMP works so that we can
> determine the correct solution here?  My main question is this:  If CPU 1
> is chugging along at a low SPL level and an interrupt comes in to CPU 2,
> can it wrestle control away from the other CPU, and/or run the interrupt
> handler concurrently?

In that case, CPU 2 uses an IPI to "push" the interrupt over to CPU 1,
since CPU 1 is in the kernel and therefore holds the giant lock.  CPU 2
will not handle an interrupt itself unless it can get the giant lock.
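
For reference, here is a rough userland sketch of the "atomic operations"
option Mike mentions for the cr functions.  It assumes the race is in the
reference-count decrement inside crfree(), which the thread does not spell
out.  The names cred_sketch, cred_alloc(), cred_hold() and cred_free() are
made up for illustration and are not the kernel's API; C11 <stdatomic.h>
stands in for whatever atomic primitives the kernel would actually use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a credential; only the count matters here. */
struct cred_sketch {
	atomic_uint	ref;	/* number of current holders */
	unsigned	uid;
};

/* Allocate a credential holding one reference for the caller. */
static struct cred_sketch *
cred_alloc(unsigned uid)
{
	struct cred_sketch *cr = malloc(sizeof(*cr));

	if (cr == NULL)
		return (NULL);
	atomic_init(&cr->ref, 1);
	cr->uid = uid;
	return (cr);
}

/* Take an additional reference. */
static void
cred_hold(struct cred_sketch *cr)
{
	atomic_fetch_add(&cr->ref, 1);
}

/*
 * Drop a reference.  The decrement and the "was that the last one?" test
 * are a single atomic step, so two callers can never both see the count
 * reach zero, nor both miss it.  A plain "--ref == 0" gives no such
 * guarantee if it can be interleaved with another caller.
 * Returns true if this call freed the object.
 */
static bool
cred_free(struct cred_sketch *cr)
{
	unsigned old = atomic_fetch_sub(&cr->ref, 1);

	assert(old > 0);	/* catch over-release */
	if (old == 1) {
		free(cr);
		return (true);
	}
	return (false);
}
```

The other option raised in the thread, taking a suitable spl around the
decrement, would keep the plain integer and instead block the interrupt
level that can reach the competing caller.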

-- 

John Baldwin <jhb at FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/