rtentry panic with FIB
Julian Elischer
julian at elischer.org
Sat Aug 30 14:57:55 UTC 2008
Robert Watson wrote:
>
> On Fri, 29 Aug 2008, John Baldwin wrote:
>
>> Unfortunately it hung trying to dump, so all I have is the stack trace
>> from DDB. This is recent HEAD running stress2
>>
>> panic: _mtx_lock_sleep: recursed on non-recursive mutex rtentry @ ../../1
>
> Kip and I have theorized that increased parallelism at higher layers of
> the network stack is exposing route locking and reference counting to
> more stress than it had done previously, and that as such we're starting
> to trigger races in the routing code more than we used to. While I
> wouldn't rule out a FIB-related bug, it seems more likely to me that
> we've hit a general bug in locking/references in the ethernet link layer
> / ARP, and we need to take a careful look at what's going on throughout
> that layer.
>
> Unfortunately, that's not something I have time to work on currently, so
> it would be great if people with an existing interest in the routing
> code (Julian and Qing have done the most work there recently?) could
> spend a few hours looking really carefully at what is happening.
I'm planning on spending a few hours looking at this this weekend.
>
> Robert N M Watson
> Computer Laboratory
> University of Cambridge
>
>>
>> cpuid = 1
>> KDB: enter: panic
>> [thread pid 14025 tid 100928 ]
>> Stopped at kdb_enter+0x3d: movq $0,0x435054(%rip)
>> db> tr
>> Tracing pid 14025 tid 100928 td 0xffffff0003773360
>> kdb_enter() at kdb_enter+0x3d
>> panic() at panic+0x14b
>> _mtx_lock_flags() at _mtx_lock_flags
>> _mtx_lock_flags() at _mtx_lock_flags+0xc3
>> rt_check_fib() at rt_check_fib+0x1ea
>> arpresolve() at arpresolve+0x77
>> ether_output() at ether_output+0x180
>> ip_output() at ip_output+0xb4f
>> udp_send() at udp_send+0x47d
>> sosend_dgram() at sosend_dgram+0x1fa
>> soo_write() at soo_write+0x30
>> dofilewrite() at dofilewrite+0x7a
>> kern_writev() at kern_writev+0x52
>> write() at write+0x4d
>> syscall() at syscall+0x1bf
>> Xfast_syscall() at Xfast_syscall+0xab
>> --- syscall (4, FreeBSD ELF64, write), rip = 0x80071cb7c, rsp =
>> 0x7fffffffe628,-
>> db> c
>> Uptime: 1h39m18s
>> Physical memory: 2038 MB
>> Dumping 263 MB:pid 14025 (udp), uid 26840, was killed: exceeded maximum CPU limit
>> pid 14099 (udp), uid 26840, was killed: exceeded maximum CPU limit
>> pid 14100 (udp), uid 26840, was killed: exceeded maximum CPU limit
>>
>> --
>> John Baldwin
>> _______________________________________________
>> freebsd-current at freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-current
>> To unsubscribe, send any mail to
>> "freebsd-current-unsubscribe at freebsd.org"
>>