em stability issues + panic
Pyun YongHyeon
pyunyh at gmail.com
Mon Oct 2 00:53:41 PDT 2006
On Mon, Oct 02, 2006 at 12:26:34AM -0700, John-Mark Gurney wrote:
> Well, I will admit I have a bit older if_em.c, v1.147, but I haven't
> been doing much w/ my em, probably not even passing close to 100mbit
> of traffic (in gige mode)... I recently obtained a crash dump from
> em_txeof where the tx_buffer is NULL at line 2958:
> 2958 if (tx_buffer->m_head) {
>
> If anyone wants some additional data, I can provide info from the
> crash dump... Just as a bit of trivia, I did load a few kld's..
> bktr.ko, bktrau.ko and iic.ko (plus respective other kld's that got
> auto loaded)... It also seems that interactiveness is more likely
> to hang em than other traffic... I've been running the box as a nfs
> server for a while w/o issues, but I log in and run ffmpeg, and it
> almost immediately hangs requiring a down/up to bring back the
> interface...
>
> The panic was when I was bringing the interface back up... Though when
> it panicked, I had down/up'd the interface a few times w/o success in
> bringing it back...
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address = 0x0
> fault code = supervisor read, page not present
> instruction pointer = 0x20:0xc047155e
> stack pointer = 0x28:0xe1d1cc50
> frame pointer = 0x28:0xe1d1cc64
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, def32 1, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 12 (swi4: clock sio)
> Physical memory: 999 MB
> Dumping 225 MB: 210 194 178 162 146 130 114 98 82 66 50 34 18 2
>
> #9 0xc066f56a in calltrap () at ../../../i386/i386/exception.s:138
> #10 0xc047155e in em_txeof (adapter=0xc34df800) at ../../../dev/em/if_em.c:2956
> #11 0xc046e502 in em_watchdog (ifp=0xc3502400) at ../../../dev/em/if_em.c:963
> #12 0xc0585b22 in if_slowtimo (arg=0x0) at ../../../net/if.c:1415
> #13 0xc0529fa9 in softclock (dummy=0x0) at ../../../kern/kern_timeout.c:271
> #14 0xc050a57a in ithread_execute_handlers (p=0xc33c38d0, ie=0xc341c580)
> at ../../../kern/kern_intr.c:662
> #15 0xc050a673 in ithread_loop (arg=0xc33a2940)
> at ../../../kern/kern_intr.c:745
> #16 0xc050981b in fork_exit (callout=0xc050a624 <ithread_loop>,
> arg=0xc33a2940, frame=0xe1d1cd38) at ../../../kern/kern_fork.c:818
> #17 0xc066f5cc in fork_trampoline () at ../../../i386/i386/exception.s:199
>
I think bringing the interface down while Rx is active may corrupt
the internal hardware state, because em_rxeof() runs without the driver lock.
See http://lists.freebsd.org/pipermail/freebsd-current/2006-September/066203.html
You may need to protect em_rxeof() with the driver lock in em_handle_rxtx().
(Remember to drop the driver lock before invoking if_input in em_rxeof().)
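
To make the locking pattern concrete, here is a rough userland sketch of the idea, not a patch against if_em.c. It assumes a pthread mutex in place of the driver lock, and every sketch_* name and struct field is hypothetical. The Rx loop touches ring state only with the lock held, drops the lock around the stand-in for if_input, and re-checks the running flag after retaking it, so a down that happened during the upcall stops the loop.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the driver softc; not the real if_em.c type. */
struct sketch_adapter {
	pthread_mutex_t lock;	/* plays the role of the driver lock */
	int lock_held;		/* debug aid, only meaningful single-threaded */
	int running;		/* cleared when the interface is brought down */
	int rx_pending;		/* packets waiting in the simulated Rx ring */
	int delivered;		/* packets handed to the stack */
	int input_saw_lock;	/* set if the upcall ran with the lock held */
};

static void
sketch_init(struct sketch_adapter *sc, int pending)
{
	pthread_mutex_init(&sc->lock, NULL);
	sc->lock_held = 0;
	sc->running = 1;
	sc->rx_pending = pending;
	sc->delivered = 0;
	sc->input_saw_lock = 0;
}

static void
sketch_lock(struct sketch_adapter *sc)
{
	pthread_mutex_lock(&sc->lock);
	sc->lock_held = 1;
}

static void
sketch_unlock(struct sketch_adapter *sc)
{
	sc->lock_held = 0;
	pthread_mutex_unlock(&sc->lock);
}

/* Stand-in for if_input: must be entered without the driver lock. */
static void
sketch_if_input(struct sketch_adapter *sc)
{
	if (sc->lock_held)
		sc->input_saw_lock = 1;
	sc->delivered++;
}

/* Called with the lock held; returns with the lock held. */
static int
sketch_rxeof(struct sketch_adapter *sc, int budget)
{
	int done = 0;

	assert(sc->lock_held);
	while (budget-- > 0 && sc->running && sc->rx_pending > 0) {
		sc->rx_pending--;	/* ring state changes under the lock */
		sketch_unlock(sc);	/* drop the lock around the upcall */
		sketch_if_input(sc);
		sketch_lock(sc);	/* retake; loop re-checks running */
		done++;
	}
	return (done);
}

/* Stand-in for the Rx part of em_handle_rxtx(): take the lock around rxeof. */
static int
sketch_handle_rxtx(struct sketch_adapter *sc)
{
	int n;

	sketch_lock(sc);
	n = sketch_rxeof(sc, 100);
	sketch_unlock(sc);
	return (n);
}

/* Stand-in for bringing the interface down: serialized by the same lock. */
static void
sketch_stop(struct sketch_adapter *sc)
{
	sketch_lock(sc);
	sc->running = 0;
	sketch_unlock(sc);
}
```

With this shape, sketch_stop() can no longer change state in the middle of a ring update; it can only get in while the lock is dropped for the upcall, and the loop notices that on its next pass.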
--
Regards,
Pyun YongHyeon