svn commit: r296909 - head/sys/ofed/drivers/infiniband/ulp/ipoib

Hans Petter Selasky hps at selasky.org
Wed Mar 16 12:46:44 UTC 2016


On 03/15/16 18:43, John Baldwin wrote:
> On Tuesday, March 15, 2016 03:47:26 PM Hans Petter Selasky wrote:
>> Author: hselasky
>> Date: Tue Mar 15 15:47:26 2016
>> New Revision: 296909
>> URL: https://svnweb.freebsd.org/changeset/base/296909
>>
>> Log:
>>    Fix witness panic in the ipoib_ioctl() function when unloading the
>>    ipoib module.
>>
>>    The bpfdetach() function is trying to turn off promiscious mode on the
>>    network interface it is attached to while holding a mutex. The fix
>>    consists of ignoring any further calls to the ipoib_ioctl() function
>>    when the network interface is going to be detached. The ipoib_ioctl()
>>    function might sleep.
>>
>>    Sponsored by:	Mellanox Technologies
>>    MFC after:	1 week
>
> In all of the other NIC drivers I have worked on the fix has been to call
> if_detach (or ether_ifdetach) as the first step in your detach method before
> you try to stop the interface.  The idea is to remove external connections
> from your driver first, and the only after those have drained do you actually
> shut down the hardware.  Given you are calling if_detach() just before
> if_free() below, ipoib just seems to be broken here.

Hi John,

It is a problem that if_ioctl() callbacks come from various places 
without proper refcounting and unknown context properties, like sleeping 
allowed or not.

The WITNESS assert I'm trying to solve is this one:

> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe04661876c0
> ipoib_ioctl() at ipoib_ioctl+0x29/frame 0xfffffe0466187700
> if_setflag() at if_setflag+0xd5/frame 0xfffffe0466187760
> ifpromisc() at ifpromisc+0x2c/frame 0xfffffe0466187790
> bpf_detachd_locked() at bpf_detachd_locked+0x1e8/frame 0xfffffe04661877d0
> bpfdetach() at bpfdetach+0x152/frame 0xfffffe0466187810
> ipoib_detach() at ipoib_detach+0x42/frame 0xfffffe0466187830
> ipoib_remove_one() at ipoib_remove_one+0x19e/frame 0xfffffe04661878a0
> ib_unregister_client() at ib_unregister_client+0x5e/frame 0xfffffe04661878e0
> ipoib_cleanup_module() at ipoib_cleanup_module+0x52/frame 0xfffffe04661878f0
> _module_run() at _module_run+0x7b/frame 0xfffffe0466187920
> linker_file_unload() at linker_file_unload+0x45f/frame 0xfffffe0466187970
> kern_kldunload() at kern_kldunload+0xbc/frame 0xfffffe04661879a0
> amd64_syscall() at amd64_syscall+0x2db/frame 0xfffffe0466187ab0
> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0466187ab0


As you can see bpfdetach() is calling ifpromisc() locked, which 
basically means we cannot do anything there with regard to ipoib.

ipoib is not a NIC driver, but an IP-bearer only.

Comparing with PCI devices is not valid here, because with a PCI device 
you usually only need to write a PCI register to switch promiscious mode 
on/off, but with IPOIB it involves commands towards the infiniband stack 
which might sleep.

> For example, detach in a typical NIC driver does this:
>
> 	struct foo_softc *sc;
>
> 	sc = device_get_softc(dev);
> 	ether_ifdetach(sc->sc_ifp);
> 	FOO_LOCK(sc);
> 	foo_stop(sc);
> 	FOO_UNLOCK(sc);
> 	callout_drain(&sc->timer);
> 	bus_teardown_intr(...);
> 	bus_release_resource(...);
> 	if_free(sc->sc_ifp);
>
> Similarly, devices with a character device in /dev should be calling
> destroy_dev() first before shutting down hardware, etc.

Right. The problem is not really the detach sequence, but that 
bpfdetach() calls ifpromisc() locked. I'm not sure if we can change that.

--HPS



More information about the svn-src-all mailing list