kern/172113: [panic] [e1000] [patch] 9.1-RC1/amd64 panices in igb(4): m_getjcl: invalid cluster type
John Baldwin
jhb at freebsd.org
Tue Jan 22 17:20:01 UTC 2013
The following reply was made to PR kern/172113; it has been noted by GNATS.
From: John Baldwin <jhb at freebsd.org>
To: Jack Vogel <jfvogel at gmail.com>
Cc: "George Neville-Neil" <gnn at freebsd.org>,
bug-followup at freebsd.org,
egrosbein at rdtc.ru,
jfv at freebsd.org
Subject: Re: kern/172113: [panic] [e1000] [patch] 9.1-RC1/amd64 panices in igb(4): m_getjcl: invalid cluster type
Date: Tue, 22 Jan 2013 12:09:32 -0500
On Monday, January 21, 2013 3:28:40 pm Jack Vogel wrote:
> Well, do you have a more complete designation of the motherboard? We can
> look into it, although if the one check stops the problem it may be a low
> priority.
It is a SuperMicro X8DTU-F.
> Jack
>
>
> On Mon, Jan 21, 2013 at 11:25 AM, George Neville-Neil <gnn at freebsd.org> wrote:
>
> >
> > On Jan 19, 2013, at 23:26 , John Baldwin <jhb at FreeBSD.org> wrote:
> >
> > > I was able to finally reproduce this panic today. It seems to require
> > > a server configured for PXE but that receives no DHCP reply (and
> > > possibly with the requisite SuperMicro X8 board). I was able to
> > > prevent the panic with a subset of the referenced patch by only adding
> > > the 'if_drv_flags & IFF_DRV_RUNNING' check to the start of
> > > igb_msix_que(). The rest of the patch was unnecessary. I also added
> > > some debugging to print out the ICR, EICR, IMS, and EIMS registers in
> > > this case. It does look like the hardware is sending an interrupt that
> > > is not enabled in the interrupt mask (specifically LSC). In fact, the
> > > 82576 datasheet specifically mentions masking LSC until initialization
> > > is complete to avoid spurious interrupts during boot and AFAICT igb(4)
> > > does this since e1000_reset_hw() clears the interrupt mask via writes
> > > to IMC and doesn't re-enable interrupts until igb_init_locked() is
> > > invoked via 'ifconfig up'. Here is my debug output:
> > >
> > > SMP: AP CPU #6 Launched!
> > > SMP: AP CPU #4 Launched!
> > > stray irq0
> > > igb0: interrupt on que 0: icr 0x1000004 eicr 0
> > > ims 0 eims 0x80000000
> > >
> > > Hmmm. Nothing clears EIMS. After some more debugging, I determined
> > > that e1000_reset_hw() always turns this bit in EIMS on, even if it is
> > > off before e1000_reset_hw() is called(!). I added explicit calls to
> > > igb_disable_intr() to clear EIMS after each call to e1000_reset_hw().
> > > This removes the 'stray irq0', but I still get a spurious interrupt
> > > during boot (albeit with eims 0). I can use the IFF_DRV_RUNNING hack
> > > for now, but I think the real fix is something else.
> > >
> >
> > I think Jack will have to chime in on this one. Do you think it's all SM
> > X8 boards or just the one we happen to have? I wonder if Jack or Jeffrey
> > (the testing guy he works with) have access to the right board.
> >
> > Best,
> > George
> >
> >
> >
>
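
For anyone following the PR, here is a minimal user-space sketch of the two
workarounds described in the quoted message: the early return when the
interface is not running, and masking again after every reset. It is a toy
model, not driver code. Every name prefixed with sketch_ is hypothetical, the
flag value is a stand-in for IFF_DRV_RUNNING, and the reset behaviour is
modelled only on the debug output quoted above (eims 0x80000000).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in constants (assumptions for this sketch, not copied from the driver). */
#define SKETCH_DRV_RUNNING 0x40u        /* plays the role of IFF_DRV_RUNNING */
#define SKETCH_EIMS_BIT31  0x80000000u  /* the EIMS bit seen in the debug output */

/* Toy adapter state: just enough to model the behaviour discussed in the thread. */
struct sketch_adapter {
    uint32_t if_drv_flags;  /* driver flags, set by the init path */
    uint32_t eims;          /* modelled extended interrupt mask */
    unsigned handled;       /* how many interrupts the handler processed */
};

/* Models the reported behaviour: a reset leaves bit 31 of EIMS set,
 * whether or not it was set beforehand. */
void sketch_reset_hw(struct sketch_adapter *a)
{
    a->eims |= SKETCH_EIMS_BIT31;
}

/* Models an explicit disable: clear every bit in the modelled mask. */
void sketch_disable_intr(struct sketch_adapter *a)
{
    a->eims = 0;
}

/* Second workaround: always follow a reset with an explicit disable. */
void sketch_reset_and_mask(struct sketch_adapter *a)
{
    sketch_reset_hw(a);
    sketch_disable_intr(a);
}

/* First workaround: ignore queue interrupts until the interface is running.
 * Returns true when the interrupt was processed, false when it was dropped. */
bool sketch_msix_que(struct sketch_adapter *a)
{
    if ((a->if_drv_flags & SKETCH_DRV_RUNNING) == 0)
        return false;       /* spurious interrupt before init: do nothing */
    a->handled++;
    return true;
}
```

In this model, an interrupt delivered before the running flag is set is
dropped without touching any receive state, which is the effect the
IFF_DRV_RUNNING check has in the quoted description. The model does not
explain why the hardware raises the interrupt in the first place.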
--
John Baldwin