LOR: "taskqueue_drain with the following non-sleepable locks held" with if_em
    Xin Li 
    delphij at delphij.net
       
    Wed May  8 17:47:06 UTC 2013
    
    
  
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
On 05/07/13 21:55, Garrett Cooper wrote:
> On Tue, May 7, 2013 at 4:06 PM, Xin Li <delphij at delphij.net>
> wrote:
>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512
>> 
>> On 05/07/13 15:03, Garrett Cooper wrote:
>>> Saw the following LOR on a CURRENT build as of yesterday with
>>> an almost idle machine processing ARP requests:
>>> 
>>> root at wf220:/mnt # taskqueue_drain with the following non-sleepable locks held:
>>> exclusive rw lle (lle) r = 0 (0xfffffe001450b410) locked @ /usr/src/sys/netinet/in.c:1484
>>> KDB: stack backtrace:
>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffff848d4f7690
>>> kdb_backtrace() at kdb_backtrace+0x39/frame 0xffffff848d4f7740
>>> witness_warn() at witness_warn+0x4a8/frame 0xffffff848d4f7800
>>> taskqueue_drain() at taskqueue_drain+0x3a/frame 0xffffff848d4f7840
>>> set_timeout() at set_timeout+0x4a/frame 0xffffff848d4f7860
>>> netevent_callback() at netevent_callback+0x16/frame 0xffffff848d4f7870
>>> arpintr() at arpintr+0x9b5/frame 0xffffff848d4f7930
>>> netisr_dispatch_src() at netisr_dispatch_src+0x60/frame 0xffffff848d4f79a0
>>> ether_demux() at ether_demux+0x130/frame 0xffffff848d4f79d0
>>> ether_nh_input() at ether_nh_input+0x369/frame 0xffffff848d4f7a30
>>> netisr_dispatch_src() at netisr_dispatch_src+0x60/frame 0xffffff848d4f7aa0
>>> em_rxeof() at em_rxeof+0x30e/frame 0xffffff848d4f7b10
>>> em_msix_rx() at em_msix_rx+0x33/frame 0xffffff848d4f7b40
>>> intr_event_execute_handlers() at intr_event_execute_handlers+0x80/frame 0xffffff848d4f7b70
>>> ithread_loop() at ithread_loop+0x128/frame 0xffffff848d4f7bb0
>>> fork_exit() at fork_exit+0x71/frame 0xffffff848d4f7bf0
>>> fork_trampoline() at fork_trampoline+0xe/frame 0xffffff848d4f7bf0
>>> --- trap 0, rip = 0, rsp = 0xffffff848d4f7cb0, rbp = 0 ---
>>> root at wf220:/mnt # uname -a
>>> FreeBSD wf220.west.isilon.com 10.0-CURRENT FreeBSD 10.0-CURRENT #1: Tue May  7 08:04:59 PDT 2013 root at wf220.west.isilon.com:/usr/obj/usr/src/sys/ISI-GENERIC amd64
>>> 
>>> I've seen this issue before for a few weeks/months, so it's
>>> nothing new (but probably should be fixed...). Thanks!
>> 
>> This has nothing to do with em(4); it looks like a bug in our
>> Linux compatibility wrapper.  In the InfiniBand code, 
>> _handle_arp_update_event() calls netevent_callback() with 
>> NETEVENT_NEIGH_UPDATE, where a cancel_delayed_work() causes the
>> drain.
>> 
>> Looking at the Linux code, it seems that we just shouldn't do
>> the drain in the cancel_delayed_work() wrapper 
>> (sys/ofed/include/linux/workqueue.h), so I think we need 
>> something like this:
>> 
>> Index: sys/ofed/include/linux/workqueue.h
>> ===================================================================
>> --- sys/ofed/include/linux/workqueue.h        (revision 250337)
>> +++ sys/ofed/include/linux/workqueue.h  (working copy)
>> @@ -184,9 +184,9 @@
>>  {
>>  
>>  	callout_stop(&work->timer);
>> -	if (work->work.taskqueue &&
>> -	    taskqueue_cancel(work->work.taskqueue, &work->work.work_task, NULL))
>> -		taskqueue_drain(work->work.taskqueue, &work->work.work_task);
>> +	if (work->work.taskqueue)
>> +		return (taskqueue_cancel(work->work.taskqueue,
>> +		    &work->work.work_task, NULL) != 0);
>>  	return 0;
>>  }
>> 
>> 
>> 
>> I've added Jeff to Cc.
> 
> The patch LGTM (I haven't hit the issue after 10 minutes of use; 
> generally it pops up almost immediately after boot or within the
> first couple of minutes).
Committed as r250374.  (The committed version inverts the return value
relative to the patch above; I committed what I believe is correct,
based on my reading of the Linux documentation.  The return value does
not affect your test result though, as it's discarded anyway.)
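
For anyone following along, here is a minimal userland sketch of the
semantics discussed above: a cancel that never sleeps and only reports
whether pending work was removed.  The mock_* names and struct are
hypothetical stand-ins, not the kernel types, and the return expression
is not a copy of what went into r250374.  It assumes the taskqueue(9)
contract that taskqueue_cancel() clears the pending count, reports the
old count through pendp, and returns EBUSY if the task is running.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userland stand-in for a queued task; not the kernel struct. */
struct mock_task {
	unsigned pending;	/* times enqueued but not yet run */
	bool running;		/* handler is executing right now */
};

/*
 * Models the taskqueue_cancel(9) contract assumed above: clear the
 * pending count, hand back the old count via pendp, and return EBUSY
 * if the handler is running, 0 otherwise.  It never sleeps.
 */
static int
mock_taskqueue_cancel(struct mock_task *t, unsigned *pendp)
{
	if (pendp != NULL)
		*pendp = t->pending;
	t->pending = 0;
	return (t->running ? EBUSY : 0);
}

/*
 * Non-blocking cancel, with no drain: true only if pending work was
 * removed and the handler was not mid-flight.  Safe to call with a
 * non-sleepable lock held, because nothing here waits.
 */
static bool
mock_cancel_delayed_work(struct mock_task *t)
{
	unsigned pend = 0;
	int error = mock_taskqueue_cancel(t, &pend);

	return (error == 0 && pend != 0);
}
```

A caller that really needs the handler to have finished would still
have to drain, but from a context where sleeping is allowed.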
Cheers,
- -- 
Xin LI <delphij at delphij.net>    https://www.delphij.net/
FreeBSD - The Power to Serve!           Live free or die
-----BEGIN PGP SIGNATURE-----
iQEcBAEBCgAGBQJRio+SAAoJEG80Jeu8UPuzvCUH+QHAXi3UCqyoBfUsNTkHofmB
riKFONZem5QsR425tg1qPcYwpgcQKAaZpu6a5ILsWZ2IPliN3QysrXFDmkVsL53/
yYK4Pcpa9TA11EjyHj3Bt1hnUqRldz5Olwhpb+RExAWaBZ0Nczf26H2GDOZvEXB4
99OXYje7bR1mbZOUoPkVcDqr4Mh0EZDHct5SxQv3eMagble5iaEiVkvunS0/P3nk
njpFbODbfMM9qs3QVxvukp3rA9M7E5cbyhl0WNDHs5h192kvy+rh5C4w3LYi+Vx9
Wlpjy9t1kxA8bLi2d0fyLqsigo2Yz6BHAwB9zs9nQ02Mg3wOPsBIIkr4y1DFiOY=
=uH91
-----END PGP SIGNATURE-----
    
    