Kernel panics in tcp_twclose

Palle Girgensohn girgen at FreeBSD.org
Fri Sep 25 14:19:27 UTC 2015


> On 25 Sep 2015, at 16:14, Palle Girgensohn <girgen at FreeBSD.org> wrote:
> 
>> 
>> On 24 Sep 2015, at 11:39, Palle Girgensohn <girgen at FreeBSD.org> wrote:
>> 
>> 
>>> On 24 Sep 2015, at 09:57, Julien Charbon <jch at FreeBSD.org> wrote:
>>> 
>>> 
>>> Hi -net,
>>> 
>>> On 24/09/15 09:03, Julien Charbon wrote:
>>>> On 24/09/15 08:55, Palle Girgensohn wrote:
>>>>>> On 24 Sep 2015, at 07:51, Palle Girgensohn
>>>>>> <girgen at pingpong.net> wrote:
>>>>>>> On 24 Sep 2015, at 00:05, Palle Girgensohn
>>>>>>> <girgen at pingpong.net> wrote:
>>>>>>>> On 23 Sep 2015, at 20:32, Julien Charbon <jch at freebsd.org> wrote:
>>>>>>>> On 23/09/15 20:26, Palle Girgensohn wrote:
>>>>>>> Kernels and userland are updated to 10.2-p3 with the patch
>>>>>>> removing the suspicious KASSERT.
>>>>>>> dtrace running continuously, redirecting to a log file.
>>>>> Just had a crash. Unfortunately, the kernel was stuck at the db>
>>>>> prompt, and the remote keyboard was unresponsive (HP ILO, not
>>>>> impressed). So I had to reset the power and never got a core dump...
>>>>> 
>>>>> panic: tcp_tw_2msl_stop: inp should not be released here
>>>>> cpuid = 0
>>>>> KDB: stack backtrace:
>>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe175acd16a0
>>>>> kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe175acd1750
>>>>> vpanic() at vpanic+0x126/frame 0xfffffe175acd1790
>>>>> kassert_panic() at kassert_panic+0x139/frame 0xfffffe175acd1800
>>>>> tcp_twclose() at tcp_twclose+0x2cb/frame 0xfffffe175acd1850
>>>>> tcp_tw_2msl_scan() at tcp_tw_2msl_scan+0x13b/frame 0xfffffe175acd1890
>>>>> tcp_slowtimo() at tcp_slowtimo+0x68/frame 0xfffffe175acd18c0
>>>>> pfslowtimo() at pfslowtimo+0x54/frame 0xfffffe175acd18f0
>>>>> softclock_call_cc() at softclock_call_cc+0x193/frame 0xfffffe175acd19d0
>>>>> softclock() at softclock+0x47/frame 0xfffffe175acd19f0
>>>>> intr_event_execute_handlers() at intr_event_execute_handlers+0x93/frame 0xfffffe175acd1a30
>>>>> ithread_loop() at ithread_loop+0xa6/frame 0xfffffe175acd1a70
>>>>> fork_exit() at fork_exit+0x84/frame 0xfffffe175acd1ab0
>>>>> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe175acd1ab0
>>>>> --- trap 0, rip = 0, rsp = 0xfffffe175acd1b70, rbp = 0 ---
>>>>> KDB: enter: panic
>>>>> [ thread pid 12 tid 100043 ]
>>>>> Stopped at      kdb_enter+0x3e: movq    $0,kdb_why
>>>>> db>
>>>> 
>>>> Thanks a lot for this backtrace.  This is what I expected: when
>>>> tcp_close() is called in the INP_TIMEWAIT case, in_pcbfree() can be
>>>> called one extra time, which leads to:
>>>> 
>>>> tcp_tw_2msl_stop: inp should not be released here
>>>> 
>>>> Let me try to come up with a tentative fix for this case.
>>> 
>>> See attached my tentative patch for this case.  It is only a first
>>> tentative patch, as I am still waiting for -net feedback on what the
>>> rule should be here.
>>> 
>>> By the way:
>>> 
>>> - I see nothing specific to VIMAGE here
>>> 
>>> - Is anyone aware of tcp_close() (or tcp_drop()) calls modified or introduced
>>> recently in 10.2 that could explain why this issue appears only now?
>>> 
>>> --
>>> Julien
>>> <tcp-close-fix-v1.patch>
>> 
>> 
>> Running a machine with the patch now (it just crashed and rebooted with the new kernel).
>> 
>> Hoping it will have a "soothing" effect... ;-)
>> 
>> 
>> dtrace running as previously. No output yet, though.
>> 
>> 
> 
> Hello -net & Julien!
> 
> First off, loud cheers and a big *thank you* to Julien for helping us get our systems to stop crashing. This really means a lot to us! Thank you!
> 
> We have been running for more than 24 hours with no crash, so I'm getting more and more confident that the change actually makes the system stable.
> 
> Dtrace still shows nothing.
> 
> Palle
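
As an aside on the tcp_tw_2msl_stop panic quoted above: the failure mode Julien describes (a release routine being reached one time too many) can be illustrated with a toy model. This is only a sketch under loud assumptions. struct toy_pcb and the toy_* functions are hypothetical names made up for illustration; they are not the real struct inpcb, in_pcbfree() or tcp_tw_2msl_stop(), and this is not what tcp-close-fix-v1.patch does. It only shows why a check of the "should not be released here" kind fires when two teardown paths both release the same object.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a protocol control block (NOT struct inpcb). */
struct toy_pcb {
    int  refcount;  /* references still held on the object */
    bool freed;     /* set once the last reference has been dropped */
};

/* Start with a single reference, as if held by the time-wait state. */
void toy_pcb_init(struct toy_pcb *p)
{
    p->refcount = 1;
    p->freed = false;
}

/*
 * Drop one reference.  Returns true for a valid release and false when
 * the object was already freed, which is the situation a kernel
 * assertion would turn into a panic.
 */
bool toy_pcb_release(struct toy_pcb *p)
{
    if (p->freed)
        return false;           /* extra release detected */
    if (--p->refcount == 0)
        p->freed = true;
    return true;
}

/*
 * Simulate two teardown paths that both believe they own the last
 * reference.  Returns the number of rejected (extra) releases.
 */
int toy_double_close(void)
{
    struct toy_pcb p;
    int rejected = 0;

    toy_pcb_init(&p);
    if (!toy_pcb_release(&p))   /* first path: legitimate release */
        rejected++;
    if (!toy_pcb_release(&p))   /* second path: one release too many */
        rejected++;
    return rejected;
}
```

In this model the second call is the one that gets rejected. The real kernel code involves locking and several reference holders, so treat this purely as a picture of the symptom, not of the fix.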


Secondly, is this error related? This machine uses *neither* VIMAGE *nor* jails. It runs a binary-installed GENERIC kernel from freebsd-update, 10.1-RELEASE-p19. It crashed today and we did not get a core dump, but I found this core.txt from a crash in August that I was not aware of (I was on holiday then... :)

Since it was installed from binaries, I have no kernel.debug.

...

panic: sbsndptr: sockbuf 0xfffff80312126c68 and mbuf 0xfffff800b4a36800 clashing

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
panic: sbsndptr: sockbuf 0xfffff80312126c68 and mbuf 0xfffff800b4a36800 clashing
cpuid = 1
KDB: stack backtrace:
#0 0xffffffff80963000 at kdb_backtrace+0x60
#1 0xffffffff80928125 at panic+0x155
#2 0xffffffff8099c180 at sbdroprecord_locked+0
#3 0xffffffff80ac8c9c at tcp_output+0xdbc
#4 0xffffffff80ac6a95 at tcp_do_segment+0x3045
#5 0xffffffff80ac2e04 at tcp_input+0xd04
#6 0xffffffff80a54fc7 at ip_input+0x97
#7 0xffffffff809f4f73 at swi_net+0x143
#8 0xffffffff808faf4b at intr_event_execute_handlers+0xab
#9 0xffffffff808fb396 at ithread_loop+0x96
#10 0xffffffff808f8b6a at fork_exit+0x9a
#11 0xffffffff80d0b67e at fork_trampoline+0xe
Uptime: 21d0h54m53s
Dumping 2005 out of 32709 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

Reading symbols from /boot/kernel/accf_data.ko.symbols...done.
Loaded symbols for /boot/kernel/accf_data.ko.symbols
Reading symbols from /boot/kernel/accf_http.ko.symbols...done.
Loaded symbols for /boot/kernel/accf_http.ko.symbols
Reading symbols from /boot/kernel/oce.ko.symbols...done.
Loaded symbols for /boot/kernel/oce.ko.symbols
Reading symbols from /boot/kernel/nullfs.ko.symbols...done.
Loaded symbols for /boot/kernel/nullfs.ko.symbols
Reading symbols from /boot/kernel/linprocfs.ko.symbols...done.
Loaded symbols for /boot/kernel/linprocfs.ko.symbols
Reading symbols from /boot/kernel/linux.ko.symbols...done.
Loaded symbols for /boot/kernel/linux.ko.symbols
Reading symbols from /boot/kernel/zfs.ko.symbols...done.
Loaded symbols for /boot/kernel/zfs.ko.symbols
Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
Loaded symbols for /boot/kernel/opensolaris.ko.symbols
#0  doadump (textdump=<value optimized out>) at pcpu.h:219
219	pcpu.h: No such file or directory.
	in pcpu.h
(kgdb) #0  doadump (textdump=<value optimized out>) at pcpu.h:219
#1  0xffffffff80927da2 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:452
#2  0xffffffff80928164 in panic (fmt=<value optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:759
#3  0xffffffff8099c180 in sbsndptr (sb=<value optimized out>, 
    off=<value optimized out>, len=<value optimized out>, 
    moff=<value optimized out>) at /usr/src/sys/kern/uipc_sockbuf.c:1011
#4  0xffffffff80ac8c9c in tcp_output (tp=0xfffff80312ef5800)
    at /usr/src/sys/netinet/tcp_output.c:870
#5  0xffffffff80ac6a95 in tcp_do_segment (m=<value optimized out>, 
    th=<value optimized out>, so=<value optimized out>, 
    tp=<value optimized out>, drop_hdrlen=<value optimized out>, tlen=0, 
    iptos=<value optimized out>, ti_locked=Cannot access memory at address 0x1
)
    at /usr/src/sys/netinet/tcp_input.c:3018
#6  0xffffffff80ac2e04 in tcp_input (m=<value optimized out>, 
    off0=<value optimized out>) at /usr/src/sys/netinet/tcp_input.c:1377
#7  0xffffffff80a54fc7 in ip_input (m=0xfffff800b4516600)
    at /usr/src/sys/netinet/ip_input.c:734
#8  0xffffffff809f4f73 in swi_net (arg=0xffffffff81988880)
    at /usr/src/sys/net/netisr.c:765
#9  0xffffffff808faf4b in intr_event_execute_handlers (
    p=<value optimized out>, ie=0xfffff800093ac600)
    at /usr/src/sys/kern/kern_intr.c:1263
#10 0xffffffff808fb396 in ithread_loop (arg=0xfffff80009388e40)
    at /usr/src/sys/kern/kern_intr.c:1276
#11 0xffffffff808f8b6a in fork_exit (
    callout=0xffffffff808fb300 <ithread_loop>, arg=0xfffff80009388e40, 
    frame=0xfffffe083c3e3ac0) at /usr/src/sys/kern/kern_fork.c:996
#12 0xffffffff80d0b67e in fork_trampoline ()
    at /usr/src/sys/amd64/amd64/exception.S:606
#13 0x0000000000000000 in ?? ()
Current language:  auto; currently minimal
(kgdb) 





