stable/11: repeatable panic due to a race

Eugene Grosbein eugen at grosbein.net
Fri Jun 16 11:39:33 UTC 2017


Hi!

I observe repeatable panics at netgraph level while doing stress test
for net/mpd5 daemon under stable/11 r317184.

It connects, uses and disconnects lots of ngXX interfaces and corresponding
netgraph nodes and hooks. I have good crashdump for INVARIANTS-enabled
custom kernel built with debugging symbols and COPFTFLAGS=-O0.

It seems that netgraph ng_iface's node hook may get disconnected and removed
in parallel with ng_iface_send(). I see no locking to prevent this in the source.

Here is kgdb script. Note that netgraph node's "hook" points to memory area
filled with "0xdeadc0de" pattern that means it has already been freed by UMA.

The machine has two physical CPU cores.

Command: kgdb kernel.debug /var/crash/vmcore.1
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:


Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer     = 0x20:0xffffffff8097f249
stack pointer           = 0x28:0xfffffe0239542ec0
frame pointer           = 0x28:0xfffffe0239542f00
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 28999 (ftpd)
trap number             = 9
panic: general protection fault
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2c/frame 0xfffffe0239542840
kdb_backtrace() at kdb_backtrace+0x53/frame 0xfffffe0239542910
vpanic() at vpanic+0x249/frame 0xfffffe02395429e0
kproc_shutdown() at kproc_shutdown/frame 0xfffffe0239542a40
trap_fatal() at trap_fatal+0x60a/frame 0xfffffe0239542b70
trap() at trap+0x97c/frame 0xfffffe0239542dd0
trap_check() at trap_check+0x15/frame 0xfffffe0239542df0
calltrap() at calltrap+0x8/frame 0xfffffe0239542df0
--- trap 0x9, rip = 0xffffffff8097f249, rsp = 0xfffffe0239542ec0, rbp = 0xfffffe0239542f00 ---
ng_address_hook() at ng_address_hook+0x59/frame 0xfffffe0239542f00
ng_iface_send() at ng_iface_send+0x108/frame 0xfffffe0239542f90
ng_iface_output() at ng_iface_output+0x447/frame 0xfffffe0239543060
ip_output() at ip_output+0x1864/frame 0xfffffe0239543300
tcp_output() at tcp_output+0x2602/frame 0xfffffe02395436a0
tcp_disconnect() at tcp_disconnect+0x18e/frame 0xfffffe02395436e0
tcp_usr_disconnect() at tcp_usr_disconnect+0xe6/frame 0xfffffe0239543710
sodisconnect() at sodisconnect+0x62/frame 0xfffffe0239543740
soclose() at soclose+0x95/frame 0xfffffe02395437b0
soo_close() at soo_close+0x4d/frame 0xfffffe02395437e0
fo_close() at fo_close+0x31/frame 0xfffffe0239543810
_fdrop() at _fdrop+0x46/frame 0xfffffe0239543840
closef() at closef+0x2d7/frame 0xfffffe02395438f0
closefp() at closefp+0xde/frame 0xfffffe0239543940
kern_close() at kern_close+0xe7/frame 0xfffffe0239543990
sys_close() at sys_close+0x1f/frame 0xfffffe02395439b0
syscallenter() at syscallenter+0x4ff/frame 0xfffffe0239543a80
amd64_syscall() at amd64_syscall+0x2a/frame 0xfffffe0239543bb0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0239543bb0
--- syscall (6, FreeBSD ELF64, sys_close), rip = 0x801a4033a, rsp = 0x7fffffffd0a8, rbp = 0x7fffffffd0d0 ---
Uptime: 2h11m5s
Dumping 544 out of 8156 MB:..3%..12%..21%..33%..42%..53%..62%..71%..83%..92%

Reading symbols from /boot/modules/geom_mirror.ko...done.
Loaded symbols for /boot/modules/geom_mirror.ko
Reading symbols from /boot/modules/accf_http.ko...done.
Loaded symbols for /boot/modules/accf_http.ko
Reading symbols from /boot/modules/nvidia.ko...done.
Loaded symbols for /boot/modules/nvidia.ko
Reading symbols from /boot/modules/vboxdrv.ko...done.
Loaded symbols for /boot/modules/vboxdrv.ko
Reading symbols from /boot/modules/mmc.ko...done.
Loaded symbols for /boot/modules/mmc.ko
Reading symbols from /boot/modules/mmcsd.ko...done.
Loaded symbols for /boot/modules/mmcsd.ko
Reading symbols from /boot/modules/sdhci.ko...done.
Loaded symbols for /boot/modules/sdhci.ko
Reading symbols from /boot/modules/h_ertt.ko...done.
Loaded symbols for /boot/modules/h_ertt.ko
Reading symbols from /boot/modules/cc_chd.ko...done.
Loaded symbols for /boot/modules/cc_chd.ko
Reading symbols from /boot/modules/geom_sched.ko...done.
Loaded symbols for /boot/modules/geom_sched.ko
Reading symbols from /boot/modules/gsched_rr.ko...done.
Loaded symbols for /boot/modules/gsched_rr.ko
Reading symbols from /boot/modules/vboxnetflt.ko...done.
Loaded symbols for /boot/modules/vboxnetflt.ko
Reading symbols from /boot/modules/vboxnetadp.ko...done.
Loaded symbols for /boot/modules/vboxnetadp.ko
Reading symbols from /boot/modules/nullfs.ko...done.
Loaded symbols for /boot/modules/nullfs.ko
Reading symbols from /usr/local/modules/rtc.ko...done.
Loaded symbols for /usr/local/modules/rtc.ko
#0  doadump (textdump=1) at /data2/src/sys/kern/kern_shutdown.c:298
298             dumptid = curthread->td_tid;
(kgdb) bt
#0  doadump (textdump=1) at /data2/src/sys/kern/kern_shutdown.c:298
#1  0xffffffff807a0828 in kern_reboot (howto=260) at /data2/src/sys/kern/kern_shutdown.c:366
#2  0xffffffff807a125f in vpanic (fmt=0xffffffff80cf5311 "%s", ap=0xfffffe0239542a20)
    at /data2/src/sys/kern/kern_shutdown.c:759
#3  0xffffffff807a12d0 in panic (fmt=0xffffffff80cf5311 "%s") at /data2/src/sys/kern/kern_shutdown.c:690
#4  0xffffffff80c06a0a in trap_fatal (frame=0xfffffe0239542e00, eva=0) at /data2/src/sys/amd64/amd64/trap.c:801
#5  0xffffffff80c0604c in trap (frame=0xfffffe0239542e00) at /data2/src/sys/amd64/amd64/trap.c:549
#6  0xffffffff80c07085 in trap_check (frame=0xfffffe0239542e00) at /data2/src/sys/amd64/amd64/trap.c:602
#7  0xffffffff80bdeba3 in calltrap () at /data2/src/sys/amd64/amd64/exception.S:236
#8  0xffffffff8097f249 in ng_address_hook (here=0x0, item=0xfffff8011ddd1f00, hook=0xfffff801f8232300, retaddr=0)
    at /data2/src/sys/netgraph/ng_base.c:3586
#9  0xffffffff80986548 in ng_iface_send (ifp=0xfffff801f8ef2800, m=0xfffff801f8526100, sa=2 '\002')
    at /data2/src/sys/netgraph/ng_iface.c:451
#10 0xffffffff80985c97 in ng_iface_output (ifp=0xfffff801f8ef2800, m=0xfffff801f8526100, dst=0xfffff8011db98720, 
    ro=0xfffff8011db98700) at /data2/src/sys/netgraph/ng_iface.c:386
#11 0xffffffff809cae14 in ip_output (m=0xfffff801f8526100, opt=0x0, ro=0xfffff8011db98700, flags=0, imo=0x0, 
    inp=0xfffff8011db98570) at /data2/src/sys/netinet/ip_output.c:655
#12 0xffffffff809e08d2 in tcp_output (tp=0xfffff801f8258410) at /data2/src/sys/netinet/tcp_output.c:1446
#13 0xffffffff809f5f4e in tcp_disconnect (tp=0xfffff801f8258410) at /data2/src/sys/netinet/tcp_usrreq.c:1946
#14 0xffffffff809f29a6 in tcp_usr_disconnect (so=0xfffff8011d8de360) at /data2/src/sys/netinet/tcp_usrreq.c:674
#15 0xffffffff80884072 in sodisconnect (so=0xfffff8011d8de360) at /data2/src/sys/kern/uipc_socket.c:1051
#16 0xffffffff808839b5 in soclose (so=0xfffff8011d8de360) at /data2/src/sys/kern/uipc_socket.c:869
#17 0xffffffff8084d67d in soo_close (fp=0xfffff8011df23b40, td=0xfffff801f8c5d000) at /data2/src/sys/kern/sys_socket.c:334
#18 0xffffffff8072bee1 in fo_close (fp=0xfffff8011df23b40, td=0xfffff801f8c5d000) at file.h:346
#19 0xffffffff80726b86 in _fdrop (fp=0xfffff8011df23b40, td=0xfffff801f8c5d000) at /data2/src/sys/kern/kern_descrip.c:2849
#20 0xffffffff8072b1f7 in closef (fp=0xfffff8011df23b40, td=0xfffff801f8c5d000) at /data2/src/sys/kern/kern_descrip.c:2430
#21 0xffffffff8072768e in closefp (fdp=0xfffff80007104000, fd=6, fp=0xfffff8011df23b40, td=0xfffff801f8c5d000, holdleaders=0)
    at /data2/src/sys/kern/kern_descrip.c:1191
#22 0xffffffff80728417 in kern_close (td=0xfffff801f8c5d000, fd=6) at /data2/src/sys/kern/kern_descrip.c:1239
#23 0xffffffff8072831f in sys_close (td=0xfffff801f8c5d000, uap=0xfffffe0239543b58)
    at /data2/src/sys/kern/kern_descrip.c:1218
#24 0xffffffff80c07b7f in syscallenter (td=0xfffff801f8c5d000, sa=0xfffffe0239543b48) at subr_syscall.c:135
#25 0xffffffff80c0741a in amd64_syscall (td=0xfffff801f8c5d000, traced=0) at /data2/src/sys/amd64/amd64/trap.c:902
#26 0xffffffff80bdee8b in Xfast_syscall () at /data2/src/sys/amd64/amd64/exception.S:396
#27 0x0000000801a4033a in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language:  auto; currently minimal
(kgdb) frame 8
#8  0xffffffff8097f249 in ng_address_hook (here=0x0, item=0xfffff8011ddd1f00, hook=0xfffff801f8232300, retaddr=0)
    at /data2/src/sys/netgraph/ng_base.c:3586
3586                NG_HOOK_NOT_VALID(peer = NG_HOOK_PEER(hook)) ||
(kgdb) l
3581             * that the peer is still connected (even if invalid,) we know
3582             * that the peer node is present, though maybe invalid.
3583             */
3584            TOPOLOGY_RLOCK();
3585            if ((hook == NULL) || NG_HOOK_NOT_VALID(hook) ||
3586                NG_HOOK_NOT_VALID(peer = NG_HOOK_PEER(hook)) ||
3587                NG_NODE_NOT_VALID(peernode = NG_PEER_NODE(hook))) {
3588                    NG_FREE_ITEM(item);
3589                    TRAP_ERROR();
3590                    TOPOLOGY_RUNLOCK();
(kgdb) p *hook
$1 = {
  hk_name = 0xfffff801f8232300 "ÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞP¯\003\201ÿÿÿÿÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­Þ"..., 
  hk_private = 0xdeadc0dedeadc0de, hk_flags = -559038242, hk_type = -559038242, hk_peer = 0xdeadc0dedeadc0de, 
  hk_node = 0xdeadc0dedeadc0de, hk_hooks = {le_next = 0xdeadc0dedeadc0de, le_prev = 0xdeadc0dedeadc0de}, 
  hk_rcvmsg = 0xdeadc0dedeadc0de, hk_rcvdata = 0xdeadc0dedeadc0de, hk_refs = -559038242}
(kgdb) 


More information about the freebsd-net mailing list