em network issues
Scott Long
scottl at samsco.org
Fri Oct 20 00:35:18 UTC 2006
Bruce Evans wrote:
> On Thu, 19 Oct 2006, John Polstra wrote:
>
>> On 19-Oct-2006 Scott Long wrote:
>>> The performance measurements that Andre and I did early this year showed
>>> that the INTR_FAST handler provided a very large benefit.
>>
>> I'm trying to understand why that's the case. Is it because an
>> INTR_FAST interrupt doesn't have to be masked and unmasked in the
>> APIC? I can't see any other reason for much of a performance
>> difference in that driver. With or without INTR_FAST, you've got
>> the bulk of the work being done in a background thread -- either the
>> ithread or the taskqueue thread. It's not clear to me that it's any
>> cheaper to run a task than it is to run an ithread.
>
> It's very unlikely to be because masking in the APIC is slow. The
> APIC is fast compared with the PIC, and even with the PIC it takes a
> very high interrupt rate (say 20 KHz) for the PIC overhead to become
> noticeable (say 5-10%). Such interrupt rates may occur, but if they
> do you've probably already lost.
>
> Previously I said that the difference might be due to interrupt
> coalescing, but that I wouldn't expect that to happen. Now I see how
> it can happen on loaded systems: the system might be so loaded that
> it often doesn't get around to running the task before a new device
> interrupt would occur if device interrupts weren't turned off. The
> scheduling of the task might accidentally be best or good enough. A
> task might work better than a software ithread accidentally because
> it has lower priority, and similarly, a software ithread might work
> better than a hardware ithread. The lower-priority threads can also
> be preempted, at least with PREEMPTION configured. This is bad for
> them but good for whatever preempts them. Apart from this, it's _more_
> expensive to run a task plus an interrupt handler (even if the interrupt
> handler is fast) than to run a single interrupt handler, and more
> expensive to switch between the handlers, and more expensive again if
> PREEMPTION actually has much effect -- then more switches occur.
>
That's all well and good, but the em task thread runs at the same
priority as a PI_NET ithread. The whole taskqueue approach was just a
prototype on the way to ifilters. I've demonstrated positive results
with it for the aac, em, and mpt drivers.
Scott
>> A difference might show up if you had two or more em devices sharing
>> the same IRQ. Then they'd share one ithread, but would each get their
>> own taskqueue thread. But sharing an IRQ among multiple gigabit NICs
>> would be avoided by anyone who cared about performance, so it's not a
>> very interesting case. Besides, when you first committed this
>> stuff, INTR_FAST interrupts were not sharable.
>
> Sharing an IRQ among a single gigabit NIC and other slower devices is
> even less interesting :-).
>
> It can be hard to measure performance, especially when there are a lot
> of threads or a lot of fast interrupts handlers. If the performance
> benefits are due to accidental scheduling then they might vanish under
> different loads.
>
It's easy to measure performance when you have a Smartbits. More kpps
means more kpps. Thanks again to Andre for making this resource
available.
Scott
More information about the freebsd-net mailing list