cxgb LOR

Matthew Fleming matthew.fleming at isilon.com
Mon Sep 14 21:28:56 UTC 2009


We got the following cxgb LOR (lock order reversal) report:

 1st 0xffffff8001e37be0 vlan_global (vlan_global) @ /build/mnt/src/sys/modules/if_vlan/../../net/if_vlan.c:1310
 2nd 0xffffff80010892f0 cxgb port lock (cxgb port lock) @ /build/mnt/src/sys/modules/cxgb/../../dev/cxgb/cxgb_main.c:1956
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
witness_checkorder() at witness_checkorder+0x9e2
_sx_xlock() at _sx_xlock+0x55
cxgb_ioctl() at cxgb_ioctl+0x1e8
vlan_ioctl() at vlan_ioctl+0x359
ifhwioctl() at ifhwioctl+0xb1
ifioctl() at ifioctl+0xb1
kern_ioctl() at kern_ioctl+0xa3
ioctl() at ioctl+0xf1
freebsd32_ioctl() at freebsd32_ioctl+0x13e
isi_syscall() at isi_syscall+0x94
ia32_syscall() at ia32_syscall+0x1a3
Xint0x80_syscall() at Xint0x80_syscall+0x60
--- syscall (54, FreeBSD ELF32, freebsd32_ioctl), rip = 0x2868db1b, rsp = 0xffffd4bc, rbp = 0xffffda38 ---
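
For readers less familiar with these reports, here is a toy userspace
sketch of the kind of check that produces one. It only remembers which
lock was held while another was acquired and flags the opposite order
when it shows up later. This is not how witness(4) is implemented (the
real code also tracks lock classes and transitive orderings); the names
order_check, order_reset and the LK_* constants are hypothetical, and
the "earlier path" in the usage below is assumed, not taken from the
report.

```c
#include <assert.h>
#include <string.h>

#define MAX_LOCKS 8

/* Hypothetical lock ids, named after the locks in the report above. */
enum { LK_VLAN_GLOBAL = 0, LK_CXGB_PORT = 1 };

/* seen[a][b] != 0 means lock a was held while lock b was acquired. */
static unsigned char seen[MAX_LOCKS][MAX_LOCKS];

/* Forget every recorded ordering. */
void
order_reset(void)
{
	memset(seen, 0, sizeof(seen));
}

/*
 * Record that `held` was owned while `wanted` was being acquired.
 * Returns 1 if the opposite order was recorded earlier (a reversal),
 * 0 otherwise.  A reversal is reported but not recorded.
 */
int
order_check(int held, int wanted)
{
	assert(held >= 0 && held < MAX_LOCKS);
	assert(wanted >= 0 && wanted < MAX_LOCKS);

	if (seen[wanted][held])
		return (1);
	seen[held][wanted] = 1;
	return (0);
}
```

If some earlier path had taken the port lock and then vlan_global,
order_check(LK_CXGB_PORT, LK_VLAN_GLOBAL) would return 0 and record
that order; the ioctl path above would then correspond to
order_check(LK_VLAN_GLOBAL, LK_CXGB_PORT), which returns 1.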


So we tried building cxgb without USE_SX, so that its locks are mtx
rather than sx.  This resulted in a different LOR:

lock order reversal: (sleepable after non-sleepable)
 1st 0xffffff8000f9d508 cxgb controller lock 0 (cxgb controller lock 0) @ /build/mnt/src/sys/modules/cxgb/cxgb/../../../dev/cxgb/cxgb_main.c:1889
 2nd 0xffffffff806064e0 ACPI root bus (ACPI root bus) @ /build/mnt/src/sys/dev/acpica/acpi.c:1040
KDB: stack backtrace:
[ffffffff8018e9fa] db_trace_self_wrapper+0x2a
[ffffffff80298e89] witness_checkorder+0x719
[ffffffff8025bf75] _sx_xlock+0x55
[ffffffff8019678a] acpi_alloc_resource+0x9a
[ffffffff80281714] resource_list_alloc+0x184
[ffffffff801d9f98] pci_alloc_resource+0x158
[ffffffff802814b9] bus_alloc_resource+0x89
[ffffffff81804201] cxgb_setup_interrupts+0x51
[ffffffff81807f33] cxgb_up+0xa3
[ffffffff818083c0] cxgb_init_locked+0x1b0
[ffffffff81808539] cxgb_init+0x39
[ffffffff81808758] cxgb_ioctl+0x1f8
[ffffffff8031e9e1] ifhwioctl+0xb1
[ffffffff8031f720] ifioctl+0xb0
[ffffffff8029a873] kern_ioctl+0xa3
[ffffffff8029aad1] ioctl+0xf1
[ffffffff8041eb93] freebsd32_ioctl+0xb3
[ffffffff8025d963] isi_syscall+0x83
[ffffffff8041de63] ia32_syscall+0x1a3
[ffffffff803efc60] Xint0x80_syscall+0x60
--- syscall (54, FreeBSD ELF32, freebsd32_ioctl), rip = 0x2826ea67, rsp = 0xffffd8ac, rbp = 0xffffd928 ---

(We modified cxgb_ioctl to call cxgb_init because otherwise the cxgb
interface would not detect a link until after an ifconfig up, which
differs from the behaviour of the em driver.  Since the locks in
question are acquired inside cxgb_init(), I don't think the rest of the
stack is relevant, but the network stack isn't my area of expertise.)

So it seems that with cxgb we're damned if we do, damned if we don't.
Any advice on which LOR is "worse", whether either one is harmless, or
how to make them go away?

Note also that if cxgb uses a mtx, it will call malloc while holding
the mtx, via this stack:

[ffffffff803cc58a] uma_zalloc_arg+0x2da
[ffffffff80241ef9] malloc+0x89
[ffffffff8023115f] intr_event_add_handler+0x5f
[ffffffff803f26d2] intr_add_handler+0x72
[ffffffff801dc171] pci_setup_intr+0x41
[ffffffff801dc171] pci_setup_intr+0x41
[ffffffff802807e6] bus_setup_intr+0x96
[ffffffff8180423c] cxgb_setup_interrupts+0x8c
[ffffffff81807f33] cxgb_up+0xa3
[ffffffff818083c0] cxgb_init_locked+0x1b0
[ffffffff81808539] cxgb_init+0x39
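
To make the concern concrete, here is a minimal userspace sketch of the
usual way to avoid allocating under a non-sleepable lock: do the
allocation first with no lock held, then take the lock only to link the
result in. A pthread mutex stands in for the kernel mtx and plain
malloc() stands in for a malloc(9) call that may sleep. struct port,
struct handler and the port_* functions are hypothetical and are not
the cxgb code. In the stack above the allocation happens inside
bus_setup_intr(), so the equivalent change in the driver would
presumably mean not holding the lock across that call.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical list entry; stands in for an interrupt handler record. */
struct handler {
	struct handler *next;
	int irq;
};

/* Hypothetical per-port state protected by a mutex. */
struct port {
	pthread_mutex_t lock;	/* stand-in for a non-sleepable mtx */
	struct handler *handlers;
	int nhandlers;
};

void
port_init(struct port *p)
{
	pthread_mutex_init(&p->lock, NULL);
	p->handlers = NULL;
	p->nhandlers = 0;
}

/*
 * Allocate with no lock held, then lock only to publish the entry.
 * Returns 0 on success, -1 if the allocation fails.
 */
int
port_add_handler(struct port *p, int irq)
{
	struct handler *h;

	assert(p != NULL);

	h = malloc(sizeof(*h));		/* may block; lock not held */
	if (h == NULL)
		return (-1);
	h->irq = irq;

	pthread_mutex_lock(&p->lock);
	h->next = p->handlers;		/* short, non-blocking section */
	p->handlers = h;
	p->nhandlers++;
	pthread_mutex_unlock(&p->lock);
	return (0);
}

/* Free all entries; the caller must ensure no other thread uses p. */
void
port_destroy(struct port *p)
{
	struct handler *h;

	while ((h = p->handlers) != NULL) {
		p->handlers = h->next;
		free(h);
	}
	p->nhandlers = 0;
	pthread_mutex_destroy(&p->lock);
}
```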

Thanks,
matthew

