Kernel panics in tcp_twclose

Palle Girgensohn girgen at FreeBSD.org
Fri Sep 25 14:14:19 UTC 2015


> On 24 Sep 2015, at 11:39, Palle Girgensohn <girgen at FreeBSD.org> wrote:
> 
> 
>> On 24 Sep 2015, at 09:57, Julien Charbon <jch at FreeBSD.org> wrote:
>> 
>> 
>> Hi -net,
>> 
>> On 24/09/15 09:03, Julien Charbon wrote:
>>> On 24/09/15 08:55, Palle Girgensohn wrote:
>>>>> On 24 Sep 2015, at 07:51, Palle Girgensohn
>>>>> <girgen at pingpong.net> wrote:
>>>>>> On 24 Sep 2015, at 00:05, Palle Girgensohn
>>>>>> <girgen at pingpong.net> wrote:
>>>>>>> On 23 Sep 2015, at 20:32, Julien Charbon <jch at freebsd.org> wrote:
>>>>>>> On 23/09/15 20:26, Palle Girgensohn wrote:
>>>>>> Kernels and userland are updated to 10.2-p3 with the patch
>>>>>> removing the suspicious KASSERT.
>>>>>> dtrace is running continuously, redirecting to a log file.
>>>> Just had a crash. Unfortunately, the kernel was stuck at the db>
>>>> prompt, and the remote keyboard was unresponsive (HP ILO, not
>>>> impressed). So I had to reset the power and never got a core dump...
>>>> 
>>>> panic: tcp_tw_2msl_stop: inp should not be released here
>>>> cpuid = 0
>>>> KDB: stack backtrace:
>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe175acd16a0
>>>> kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe175acd1750
>>>> vpanic() at vpanic+0x126/frame 0xfffffe175acd1790
>>>> kassert_panic() at kassert_panic+0x139/frame 0xfffffe175acd1800
>>>> tcp_twclose() at tcp_twclose+0x2cb/frame 0xfffffe175acd1850
>>>> tcp_tw_2msl_scan() at tcp_tw_2msl_scan+0x13b/frame 0xfffffe175acd1890
>>>> tcp_slowtimo() at tcp_slowtimo+0x68/frame 0xfffffe175acd18c0
>>>> pfslowtimo() at pfslowtimo+0x54/frame 0xfffffe175acd18f0
>>>> softclock_call_cc() at softclock_call_cc+0x193/frame 0xfffffe175acd19d0
>>>> softclock() at softclock+0x47/frame 0xfffffe175acd19f0
>>>> intr_event_execute_handlers() at intr_event_execute_handlers+0x93/frame 0xfffffe175acd1a30
>>>> ithread_loop() at ithread_loop+0xa6/frame 0xfffffe175acd1a70
>>>> fork_exit() at fork_exit+0x84/frame 0xfffffe175acd1ab0
>>>> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe175acd1ab0
>>>> --- trap 0, rip = 0, rsp = 0xfffffe175acd1b70, rbp = 0 ---
>>>> KDB: enter: panic
>>>> [ thread pid 12 tid 100043 ]
>>>> Stopped at      kdb_enter+0x3e: movq    $0,kdb_why
>>>> db>
>>> 
>>> Thanks a lot for this backtrace.  This is what I expected: when
>>> tcp_close() is called in the INP_TIMEWAIT case, in_pcbfree() can be
>>> called one extra time, which leads to:
>>> 
>>> tcp_tw_2msl_stop: inp should not be released here
>>> 
>>> Let me try to come up with a tentative fix for this case.
>> 
>> See my tentative patch, attached, for this case.  It is only a first
>> attempt, as I am still waiting for -net feedback on what the rule
>> should be here.
>> 
>> By the way:
>> 
>> - I see nothing specific to VIMAGE here
>> 
>> - Is anyone aware of tcp_close() (or tcp_drop()) calls modified or
>> introduced recently in 10.2 that could explain why this issue appears
>> only now?
>> 
>> --
>> Julien
>> <tcp-close-fix-v1.patch>
> 
> 
> Running a machine with the patch now (it just crashed and rebooted with the new kernel).
> 
> Hoping it will have a "soothing" effect... ;-)
> 
> 
> dtrace is running as before. No output yet, though.
> 
> 

Hello -net & Julien!

First off, loud cheers and a big *thank you* to Julien for helping us get our systems to stop crashing. This really means a lot to us! Thank you!

We have been running for more than 24 hours with no crash, so I'm getting more and more confident that the change actually makes the system stable.

Dtrace still shows nothing.

Palle
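
For readers following the thread, here is a minimal user-space sketch of the failure mode Julien describes in the quoted text: a reference being dropped one extra time, so that a later release, which expects not to be the final one, ends up releasing the object. This is not FreeBSD kernel code and not the attached patch. The names struct pcb, pcb_rele() and run_scenario() are hypothetical and used only for illustration, and the assumption that the object starts with two references (an owner and a secondary holder) is a simplification.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a control block; NOT the real struct inpcb. */
struct pcb {
    int  refcount;  /* outstanding references */
    bool freed;     /* set once the last reference is dropped */
};

/* Drop one reference. Returns true if this call released the object. */
static bool pcb_rele(struct pcb *p)
{
    if (--p->refcount > 0)
        return false;
    p->freed = true;
    return true;
}

/*
 * Assumed model: the object starts with an owner reference plus a
 * secondary reference. The secondary holder drops its reference and
 * expects that this does not release the object.
 *
 * early_owner_frees is how many times the owner's free path ran before
 * the secondary holder got to release its own reference.
 *
 * Returns 0 if the invariant held, -1 if the secondary release turned
 * out to be the last one (where a kernel assertion would fire).
 */
int run_scenario(int early_owner_frees)
{
    struct pcb p = { .refcount = 2, .freed = false };

    for (int i = 0; i < early_owner_frees; i++)
        pcb_rele(&p);               /* the "extra" free */

    bool released = pcb_rele(&p);   /* secondary holder's release */
    return released ? -1 : 0;
}
```

In this model run_scenario(0) returns 0 because the owner reference is still held, while run_scenario(1) returns -1 because the extra free has already consumed it.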


