Possible UDP related deadlock in 7.1-PRERELEASE
Norbert Papke
fbsd-ml at scrapper.ca
Tue Sep 16 01:14:01 UTC 2008
On September 15, 2008, Gavin Atkinson wrote:
> On Sun, 2008-09-14 at 12:19 -0700, Norbert Papke wrote:
> > Symptoms:
> >
> > * I can trigger this lockup reliably by starting ktorrent. After a short
> > while (one to two minutes), it locks up. Other commands, e.g., netstat,
> > also lock up.
> > * The console generates "nfe0: watchdog timeout" error messages.
> > * The system becomes unusable and must be rebooted.
> >
> > Attempted Diagnosis:
> >
> > If I break into DDB, the 'ps' output shows a number of processes that
> > seem to be locked related to udp.
> >
> > [irq18:dc0] L *udp
> > ktorrent L *udpinp
> > hald L *udp
> > ntpd L *udp
> >
> > Unfortunately, I am rapidly getting out of my depth here. I have no idea
> > how to go about further analyzing this problem and would appreciate help.
>
> Can you add:
> options WITNESS
> options WITNESS_SKIPSPIN
>
> to your kernel, recompile and wait for the problem to happen again?
> When it does, from the debugger issue "sh alllocks" and make a note of
> the output?
With WITNESS enabled, the system now panics, so I could not follow your
instructions. There is no core dump. The following gets logged
to /var/log/messages:
shared lock of (rw) udpinp @ /usr/src/sys/netinet/udp_usrreq.c:864
while exclusively locked from /usr/src/sys/netinet6/udp6_usrreq.c:940
panic: share->excl
KDB: stack backtrace:
db_trace_self_wrapper(c06fda7c,f6b96978,c052046a,c06fbb5d,c07695c0,...) at db_trace_self_wrapper+0x26
kdb_backtrace(c06fbb5d,c07695c0,c06febd1,f6b96984,f6b96984,...) at kdb_backtrace+0x29
panic(c06febd1,c070c409,3ac,c0709eee,360,...) at panic+0xaa
witness_checkorder(ccd5209c,1,c0709eee,360,8,...) at witness_checkorder+0x17c
_rw_rlock(ccd5209c,c0709eee,360,c07780e0,cd4652c8,...) at _rw_rlock+0x2a
udp_send(d3942000,0,c580f400,c68faa00,0,...) at udp_send+0x197
udp6_send(d3942000,0,c580f400,c68faa00,0,...) at udp6_send+0x140
sosend_generic(d3942000,c68faa00,f6b96be8,0,0,...) at sosend_generic+0x50d
sosend(d3942000,c68faa00,f6b96be8,0,0,...) at sosend+0x3f
kern_sendit(cd465230,f,f6b96c64,0,0,...) at kern_sendit+0x106
sendit(0,871b9fe,0,c68faa00,1c,...) at sendit+0x182
sendto(cd465230,f6b96cfc,18,cd465230,c072bab8,...) at sendto+0x4f
syscall(f6b96d38) at syscall+0x293
Note that I do not use IPv6; none of my network interfaces is configured for
it.
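One possible reading of the WITNESS message and backtrace above, sketched as a toy model: udp6_send() appears to hold the udpinp lock exclusively (udp6_usrreq.c:940) when it calls udp_send(), which then asks for the same lock in shared mode (udp_usrreq.c:864). The code below is not the kernel's code. The toy_rwlock type, the *_sketch functions and the hand_off_to_v4 flag are hypothetical names invented for illustration, and the condition under which the real udp6_send() hands off to udp_send() is an assumption.

```c
#include <assert.h>
#include <stdio.h>

/* Toy single-threaded model of a read/write lock; NOT the kernel's struct rwlock. */
enum toy_state { TOY_UNLOCKED, TOY_SHARED, TOY_EXCLUSIVE };

struct toy_rwlock {
    const char *name;
    enum toy_state state;
    int readers;
};

/* Shared acquire. Returns -1 where WITNESS would panic with "share->excl". */
static int toy_rlock(struct toy_rwlock *l)
{
    if (l->state == TOY_EXCLUSIVE) {
        fprintf(stderr, "shared lock of %s while exclusively locked\n", l->name);
        return -1;
    }
    l->state = TOY_SHARED;
    l->readers++;
    return 0;
}

static void toy_runlock(struct toy_rwlock *l)
{
    assert(l->state == TOY_SHARED && l->readers > 0);
    if (--l->readers == 0)
        l->state = TOY_UNLOCKED;
}

/* Exclusive acquire. Returns -1 if the lock is already held in any mode. */
static int toy_wlock(struct toy_rwlock *l)
{
    if (l->state != TOY_UNLOCKED)
        return -1;
    l->state = TOY_EXCLUSIVE;
    return 0;
}

static void toy_wunlock(struct toy_rwlock *l)
{
    assert(l->state == TOY_EXCLUSIVE);
    l->state = TOY_UNLOCKED;
}

/* Stand-in for udp_send(): takes the inpcb lock in shared mode. */
static int udp_send_sketch(struct toy_rwlock *inp)
{
    if (toy_rlock(inp) != 0)
        return -1;
    /* ... the datagram would be built and sent here ... */
    toy_runlock(inp);
    return 0;
}

/* Stand-in for udp6_send(): takes the same lock exclusively, then may
 * call the IPv4 path while still holding it (assumed behaviour). */
static int udp6_send_sketch(struct toy_rwlock *inp, int hand_off_to_v4)
{
    int error = 0;

    if (toy_wlock(inp) != 0)
        return -1;
    if (hand_off_to_v4)
        error = udp_send_sketch(inp);   /* shared request on a lock we hold exclusively */
    toy_wunlock(inp);
    return error;
}
```

In this model, calling udp6_send_sketch() with hand_off_to_v4 set fails at the same point the backtrace shows, while either function called on its own succeeds. Without WITNESS there is nothing to report the conflict, so a thread in that position would presumably just wait on the lock, which would be consistent with the hang described in the original report.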
Also, since I enabled WITNESS, I get the following logged during system
startup:
Enabling pf.
lock order reversal:
1st 0xc09af92c pf task mtx (pf task mtx) @ /usr/src/sys/modules/pf/../../contrib/pf/net/pf_ioctl.c:1394
2nd 0xc07b4d68 ifnet (ifnet) @ /usr/src/sys/net/if.c:1558
KDB: stack backtrace:
db_trace_self_wrapper(c06fda7c,f4914a60,c0552c75,c06fed11,c07b4d68,...) at db_trace_self_wrapper+0x26
kdb_backtrace(c06fed11,c07b4d68,c0703ca2,c0703ca2,c0703c73,...) at kdb_backtrace+0x29
witness_checkorder(c07b4d68,9,c0703c73,616,572,...) at witness_checkorder+0x5e5
_mtx_lock_flags(c07b4d68,0,c0703c73,616,c0104414,...) at _mtx_lock_flags+0x34
ifunit(c6ef5c20,0,c09adfb5,572,c0703a71,...) at ifunit+0x2f
pfioctl(c566ce00,c0104414,c6ef5c20,3,c60c38c0,...) at pfioctl+0x2b43
devfs_ioctl_f(c588bb94,c0104414,c6ef5c20,c54bb900,c60c38c0,...) at devfs_ioctl_f+0xe6
kern_ioctl(c60c38c0,3,c0104414,c6ef5c20,1000000,...) at kern_ioctl+0x243
ioctl(c60c38c0,f4914cfc,c,c0718d59,c072b350,...) at ioctl+0x134
syscall(f4914d38) at syscall+0x293
Xint0x80_syscall() at Xint0x80_syscall+0x20
--- syscall (54, FreeBSD ELF32, ioctl), eip = 0x281ab6f3, esp = 0xbfbfde3c, ebp = 0xbfbfde68 ---
pf enabled
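For readers unfamiliar with "lock order reversal" reports, the idea can be sketched roughly as follows: the checker remembers which lock was held when another was acquired, and complains when it later sees the same pair taken the other way round. This is a toy model only, not WITNESS itself; toy_witness, toy_check_order and the lock ids are hypothetical, and the real checker also tracks indirect orderings that this sketch ignores. Which earlier code path took ifnet before pf task mtx is not shown in the log above.

```c
#include <stdio.h>

#define TOY_MAX_LOCKS 8

/* Hypothetical lock ids for the two locks named in the report. */
enum { TOY_IFNET = 0, TOY_PF_TASK_MTX = 1 };

/* order[a][b] != 0 means: lock a was held while lock b was acquired. */
struct toy_witness {
    int order[TOY_MAX_LOCKS][TOY_MAX_LOCKS];
};

/*
 * Record that `second` is being acquired while `first` is held.
 * Returns 1 if the opposite order was recorded earlier (a reversal),
 * 0 if the order is new or consistent, -1 for an out-of-range id.
 */
static int toy_check_order(struct toy_witness *w, int first, int second)
{
    if (first < 0 || first >= TOY_MAX_LOCKS ||
        second < 0 || second >= TOY_MAX_LOCKS)
        return -1;
    if (w->order[second][first]) {
        fprintf(stderr, "lock order reversal: 1st %d, 2nd %d\n", first, second);
        return 1;
    }
    w->order[first][second] = 1;
    return 0;
}
```

With an empty toy_witness, recording TOY_IFNET then TOY_PF_TASK_MTX returns 0, and a later TOY_PF_TASK_MTX then TOY_IFNET returns 1, which corresponds to the "1st pf task mtx, 2nd ifnet" report. A reversal is a warning about a potential deadlock and does not by itself mean one occurred.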
I tried to unload 'pf' to see if it was the culprit. However, even without pf
loaded, I experience the panic.
Is there anything else I can try to provide better insight into what might be
going on?
Cheers,
-- Norbert.