em net (optical GigE) driver hangs?

Don Bowman don at sandvine.com
Tue Apr 22 14:47:44 PDT 2003


From: John Polstra [mailto:jdp at polstra.com]
> Sent: April 22, 2003 16:12
> To: net at freebsd.org
> Subject: Re: em net (optical GigE) driver hangs?
> 
> 
> In article 
> <FE045D4D9F7AED4CBFF1B3B813C8533701918A83 at mail.sandvine.com>,
> Dave Dolson  <ddolson at sandvine.com> wrote:
> > 
> > Has anyone experienced em interface hangs after approx several days
> > of heavy operation?
> > 
> > We are using a system which is mostly RELENG_4_7, using multiple
> > optical em GigE devices.
> > 
> > The symptom is that the interface stops transmitting or receiving,
> > reporting drops on output (no tx descriptors) and input errors (MPC
> > stat-->no receive descriptors).
> > 
> > It turns out that all but 64 transmit descriptors are in use.  The
> > driver is waiting for the "done" flag to be set so it can clean the
> > descriptors.
> > The device is also in the OACTIVE state at this time.
> > 
> > After the interface is brought down (or unplugged), the em watchdog
> > timer goes off 5s later.
> > 
> > We are trying to figure out two things:
> > 1. why did the driver lock up?
> > 2. why didn't the watchdog timer go off earlier?
> > 
> > I think we would be happy to solve #2 given the rarity of the event.
> > Is the RELENG_4 version likely to fix the problem?
> 
> I think the RELENG_4 version is likely to eliminate the problem.  See
> the comment near the define of EM_RDTR in if_em.h (in the RELENG_4
> version of that file, of course).

We saw that, but we are using DEVICE_POLLING, so we assumed it was not
the issue. We think instead it's another problem, which is also solved
in the RELENG_4 driver: em_poll() calls em_start() if the device is
running and there are packets on the queue, and em_start() re-arms the
timer, holding off the watchdog forever.
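
To make the suspected mechanism concrete, here is a small stand-alone
simulation. It is only a sketch and not the if_em.c code: the sim_*
names, the struct layout, and the one-poll-per-second loop are all
made up for illustration. It assumes the watchdog is a per-interface
countdown that is decremented once a second and fires when it reaches
zero, and it models the hang as zero usable transmit descriptors. The
"re-arm only when a packet was actually queued" variant is just one
possible remedy, not necessarily what RELENG_4 does.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for driver state; not the real if_em.c code. */
#define TX_TIMEOUT_TICKS 5      /* watchdog period in one-second ticks */

struct sim_if {
    int  if_timer;              /* watchdog countdown; 0 means disarmed */
    int  free_desc;             /* usable transmit descriptors */
    int  queued_pkts;           /* packets waiting on the send queue */
    bool watchdog_fired;
};

/*
 * Transmit path. With rearm_unconditionally set, the timer is reset on
 * every call, even if no packet could be handed to the hardware.
 * Otherwise it is reset only after at least one packet was queued.
 */
static void sim_start(struct sim_if *ifp, bool rearm_unconditionally)
{
    bool sent = false;

    while (ifp->queued_pkts > 0 && ifp->free_desc > 0) {
        ifp->queued_pkts--;
        ifp->free_desc--;
        sent = true;
    }
    if (rearm_unconditionally || sent)
        ifp->if_timer = TX_TIMEOUT_TICKS;
}

/* Polling entry point: kick the transmit path whenever packets are queued. */
static void sim_poll(struct sim_if *ifp, bool rearm_unconditionally)
{
    if (ifp->queued_pkts > 0)
        sim_start(ifp, rearm_unconditionally);
}

/* Once-a-second tick: count down and fire the watchdog on reaching zero. */
static void sim_tick(struct sim_if *ifp)
{
    if (ifp->if_timer > 0 && --ifp->if_timer == 0)
        ifp->watchdog_fired = true;
}

/*
 * Simulate a hung interface: the hardware never completes a descriptor,
 * packets keep waiting, and the timer was armed by the last good send.
 * Returns true if the watchdog fired within the given number of seconds.
 */
static bool sim_hung_interface(bool rearm_unconditionally, int seconds)
{
    struct sim_if ifp = {
        .if_timer = TX_TIMEOUT_TICKS,
        .free_desc = 0,
        .queued_pkts = 10,
        .watchdog_fired = false,
    };

    for (int s = 0; s < seconds && !ifp.watchdog_fired; s++) {
        sim_poll(&ifp, rearm_unconditionally);  /* polling keeps running */
        sim_tick(&ifp);
    }
    return ifp.watchdog_fired;
}
```

Under these assumptions, sim_hung_interface(true, 60) never sees the
watchdog fire, because every poll pushes the countdown back to 5 before
the tick can bring it to zero. sim_hung_interface(false, 60) fires on
the fifth tick. That would also fit the watchdog going off about 5s
after the interface is brought down, since the poll path then stops
calling the start routine.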

--don

