Deterministic panic due to non-sleepable lock with if_alc when reconfiguring interfaces
John Baldwin
jhb at freebsd.org
Fri Aug 19 12:10:34 UTC 2011
On Friday, August 19, 2011 3:17:12 am Garrett Cooper wrote:
> On Thu, Aug 18, 2011 at 9:31 PM, <mdf at freebsd.org> wrote:
> > On Thu, Aug 18, 2011 at 5:50 PM, Garrett Cooper <yanegomi at gmail.com> wrote:
> >> When loading if_alc as a module on my netbook and running
> >> /etc/rc.d/netif restart, I can deterministically panic my netbook with
> >> the following message:
>
> These repro steps were overly simplified. The complete steps are:
>
> 1. Attach ethernet cable to alc(4) enabled NIC.
> 2. Boot up machine.
> 3. Login.
> 4. Physically remove ethernet cable from alc(4) enabled NIC.
> 5. Run `/etc/rc.d/netif restart' as root.
>
> >> ) at _bus_dmamap_sync+0x51
> >> alc_stop(c3dbb000,0,c0c51844,93a,80206910,...) at alc_stop+0x24e
> >> alc_ioctl(c3d07400,80206910,c40423c0,c06a7935,c0914e3c,...) at alc_ioctl+0x22e
> >> ifioctl(c45029c0,80206910,c40423c0,c40505c0,c4528c00,...) at ifioctl+0xc98
> >> soo_ioctl(c4574e00,80206910,c40423c0,c413e680,c40505c0,...) at soo_ioctl+0x401
> >> kern_ioctl(c40505c0,3,80206910,c40423c0,c40423c0,...) at kern_ioctl+0x1d7
> >> ioctl(c40505c0,e6ca3cec,e6ca3d28,c08e929d,0,...) at ioctl+0x118
> >> syscallenter(c40505c0,e6ca3ce4,e6ca3ce4,0,0,...) at syscallenter+0x23f
> >> syscall(e6ca3d28) at syscall+0x2e
> >> Xint0x80_syscall() at Xint0x80_syscall+0x21
> >> --- syscall (54kernel trap 12 with interrupts disabled
> >> Kernel page fault with the following non-sleepable locks held:
> >> exclusive sleep mutex alc0 (network driver) r = 0 (0xc3dbc608) locked
> >> @ /usr/src/sys/modules/alc/../../dev/alc/if_alc.c:2362
> >> KDB: stack backtrace:
> >> db_trace_self_wrapper(c08e727a,80,6e726500,74206c65,20706172,...) at
> >> db_trace_self_wrapper+0x26
> >> kdb_backtrace(93a,0,ffffffff,c0ad6114,e6ca323c,...) at kdb_backtrace+0x2a
> >> _witness_debugger(c08e9f67,e6ca3250,4,1,0,...) at _witness_debugger+0x1e
> >> witness_warn(5,0,c0924fe1,c097df50,c3e42b00,...) at witness_warn+0x1f1
> >> trap(e6ca32dc) at trap+0x15a
> >> calltrap() at calltrap+0x6
> >>
> >> I tried to track down what the exact issue was, but I got lost
> >> (the locking sort of looks ok to me, but I'm still not an expert with
> >> mutex(9)).
> >> I still have the vmcore and can provide more helpful details when
> >> requested.
> >
> > The locking itself is almost certainly fine. The error message is not
> > very helpful, but what went wrong was the page fault. You just happen
> > to panic on a witness warning before vm_fault can panic due to a bad
> > address.
> >
> > The alc(4) maintainer would probably like info on the trap (line of
> > code and where the bad pointer came from).
>
> I talked to Xin a bit and, as he noted, the panic was just a symptom
> of the actual issue at hand. I think the problem is that the rx ring's
> rx_m value isn't set to NULL when an error occurs, but getting to
> the exact problem at hand, the following call is failing:
>
>     if (bus_dmamap_load_mbuf_sg(sc->alc_cdata.alc_rx_tag, // <-- HERE
>         sc->alc_cdata.alc_rx_sparemap, m, segs, &nsegs, 0) != 0) {
>             m_freem(m);
>             return (ENOBUFS);
>     }
>
> It's failing with ENOMEM. Still trying to determine what the exact
> reason for ENOMEM is from the x86 busdma code though..

     ENOMEM   The load request has failed due to insufficient
              resources, and the caller specifically used the
              BUS_DMA_NOWAIT flag.

(bus_dmamap_load_mbuf*() imply BUS_DMA_NOWAIT.)

You couldn't allocate enough bounce pages:

    /* Reserve Necessary Bounce Pages */
    if (map->pagesneeded != 0) {
            mtx_lock(&bounce_lock);
            if (flags & BUS_DMA_NOWAIT) {
                    if (reserve_bounce_pages(dmat, map, 0) != 0) {
                            mtx_unlock(&bounce_lock);
                            return (ENOMEM);
                    }
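
To make the failure mode concrete, here is a simplified user-space model of an all-or-nothing, non-blocking reservation from a fixed pool. It is only a sketch, not the kernel's busdma code: `struct bounce_pool` and `reserve_nowait()` are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a fixed pool of bounce pages. */
struct bounce_pool {
	int free_pages;		/* pages still available */
	int reserved_pages;	/* pages handed out to maps */
};

/*
 * Non-blocking reservation: either every requested page is
 * reserved, or nothing is taken and ENOMEM is returned.
 * A NOWAIT caller cannot sleep until pages are freed.
 */
static int
reserve_nowait(struct bounce_pool *pool, int pages_needed)
{
	assert(pages_needed >= 0);
	if (pages_needed > pool->free_pages)
		return (ENOMEM);
	pool->free_pages -= pages_needed;
	pool->reserved_pages += pages_needed;
	return (0);
}
```

In this model a request larger than the remaining pool fails immediately and leaves the pool untouched, which is the kind of error the driver sees from the load call.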

Of course, now the question is why you even need bounce pages for alc(4):

    /* Create DMA tag for Rx buffers. */
    error = bus_dma_tag_create(
        sc->alc_cdata.alc_buffer_tag,   /* parent */
        ALC_RX_BUF_ALIGN, 0,            /* alignment, boundary */
        BUS_SPACE_MAXADDR,              /* lowaddr */
        BUS_SPACE_MAXADDR,              /* highaddr */
        NULL, NULL,                     /* filter, filterarg */
        MCLBYTES,                       /* maxsize */
        1,                              /* nsegments */
        MCLBYTES,                       /* maxsegsize */
        0,                              /* flags */
        NULL, NULL,                     /* lockfunc, lockarg */
        &sc->alc_cdata.alc_rx_tag);

It can handle 64-bit DMA just fine, and mbuf clusters used for RX should
always be aligned and never need bounce pages.
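
As a rough illustration of that reasoning, the sketch below models the two conditions under which a physical address would need bouncing: it falls in the excluded (lowaddr, highaddr] window, or it violates the tag's alignment. This is a user-space approximation under my own assumptions, not the x86 busdma source; `struct dma_constraints` and `needs_bounce()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the DMA tag fields relevant to bouncing. */
struct dma_constraints {
	uint64_t lowaddr;	/* addresses above this, up to highaddr, are excluded */
	uint64_t highaddr;
	uint64_t alignment;	/* required alignment, a power of two */
};

/* Return true if a buffer at paddr would have to be bounced. */
static bool
needs_bounce(const struct dma_constraints *c, uint64_t paddr)
{
	/* The mask trick below only works for power-of-two alignments. */
	assert(c->alignment != 0 && (c->alignment & (c->alignment - 1)) == 0);

	bool excluded = paddr > c->lowaddr && paddr <= c->highaddr;
	bool misaligned = (paddr & (c->alignment - 1)) != 0;

	return (excluded || misaligned);
}
```

With lowaddr set to the maximum address the excluded window is empty, so in this model only a misaligned buffer address could trigger bouncing.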
--
John Baldwin