two problems in dev/e1000/if_lem.c::lem_handle_rxtx()

Barney Cordoba barney_cordoba at yahoo.com
Fri Jan 18 14:30:48 UTC 2013



--- On Thu, 1/17/13, Adrian Chadd <adrian at freebsd.org> wrote:

> From: Adrian Chadd <adrian at freebsd.org>
> Subject: Re: two problems in dev/e1000/if_lem.c::lem_handle_rxtx()
> To: "Barney Cordoba" <barney_cordoba at yahoo.com>
> Cc: "Luigi Rizzo" <rizzo at iet.unipi.it>, freebsd-net at freebsd.org
> Date: Thursday, January 17, 2013, 11:48 AM
> There's also the subtle race
> condition in TX and RX handling that
> re-queuing the taskqueue gets around.
> 
> Which is:
> 
> * The hardware is constantly receiving frames , right until
> you blow
> the FIFO away by filling it up;
> * The RX thread receives a bunch of frames;
> * .. and processes them;
> * .. once it's done processing, the hardware may have read
> some more
> frames in the meantime;
> * .. and the hardware may have generated a mitigated
> interrupt which
> you're ignoring, since you're processing frames;
> * So if your architecture isn't 100% paranoid, you may end
> up having
> to wait for the next interrupt to handle what's currently in
> the
> queue.
> 
> Now if things are done correct:
> 
> * The hardware generates a mitigated interrupt
> * The mask register has that bit disabled, so you don't end
> up receiving it;
> * You finish your RX queue processing, and there's more
> stuff that's
> appeared in the FIFO (hence why the hardware has generated
> another
> mitigated interrupt);
> * You unmask the interrupt;
> * .. and the hardware immediately sends you the MSI or
> signals an interrupt;
> * .. thus you re-enter the RX processing thread almost(!)
> immediately.
> 
> However as the poster(s) have said, the interrupt
> mask/unmask in the
> intel driver(s) may not be 100% correct, so you're going to
> end up
> with situations where interrupts are missed.
> 
> The reason why this wasn't a big deal in the deep/distant
> past is
> because we didn't used to have kernel preemption, or
> multiple kernel
> threads running, or an overly aggressive scheduler trying
> to
> parallelise things as much as possible. A lot of
> net80211/ath bugs
> have popped out of the woodwork specifically because of the
> above
> changes to the kernel. They were bugs before, but people
> didn't hit
> them.
> 

I don't see the distinction between the rx thread getting re-scheduled
"immediately" vs introducing another thread. In fact you increase missed
interrupts by this method. The entire point of interrupt moderation is
to tune the intervals where a driver is processed.

You might as well just not have a work limit and process until your done.
The idea that "gee, I've been taking up too much cpu, I'd better yield"
to just queue a task and continue soon after doesn't make much sense to 
me.

BC


More information about the freebsd-net mailing list