SMPable version of EM driver

Vladimir Ivanov wawa at yandex-team.ru
Wed Oct 3 03:42:33 PDT 2007


Bruce Evans wrote:
> On Tue, 2 Oct 2007, Vladimir Ivanov wrote:
>
>> Main improvement of this version: the driver does not use TX interrupts
>> at all, so the interrupt rate is reduced significantly.
>
> Polling for anything is a bug IMO.  Buggy hardware may work better
> with it, but em is not buggy :-).
The driver does not use polling. We've disabled TX interrupts because we
think the interrupt handler is an awkward and inefficient place to do
watchdog calculations and queue cleaning. We can do that much more easily
from application context.
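
To make the idea concrete, here is a minimal sketch of cleaning the TX ring and running the watchdog from the transmit path instead of from an interrupt. It is a user-space model, not code from the patch: the names (tx_ring, tx_reclaim, tx_enqueue, tx_watchdog_tick), the ring size and the stall threshold are all hypothetical, and the `done` flag stands in for whatever completion status the hardware writes back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TX_RING_SIZE 8          /* hypothetical ring size */
#define TX_WATCHDOG_TICKS 5     /* hypothetical stall threshold */

struct tx_desc {
    bool done;                  /* set by the (simulated) NIC once sent */
};

struct tx_ring {
    struct tx_desc desc[TX_RING_SIZE];
    size_t head;                /* oldest descriptor still in flight */
    size_t tail;                /* next free slot */
    size_t in_flight;           /* descriptors between head and tail */
    unsigned stalled_ticks;     /* ticks since the head last advanced */
};

/* Reclaim completed descriptors; returns how many were freed. */
static size_t tx_reclaim(struct tx_ring *r)
{
    size_t freed = 0;

    while (r->in_flight > 0 && r->desc[r->head].done) {
        r->desc[r->head].done = false;
        r->head = (r->head + 1) % TX_RING_SIZE;
        r->in_flight--;
        freed++;
    }
    if (freed > 0 || r->in_flight == 0)
        r->stalled_ticks = 0;   /* progress was made, or nothing to wait for */
    return freed;
}

/* Queue one packet from thread context; the ring is cleaned first. */
static bool tx_enqueue(struct tx_ring *r)
{
    tx_reclaim(r);
    if (r->in_flight == TX_RING_SIZE)
        return false;           /* ring full, caller retries later */
    assert(!r->desc[r->tail].done);
    r->tail = (r->tail + 1) % TX_RING_SIZE;
    r->in_flight++;
    return true;
}

/* Periodic tick from thread context; true means the TX path looks hung. */
static bool tx_watchdog_tick(struct tx_ring *r)
{
    tx_reclaim(r);
    if (r->in_flight == 0)
        return false;
    return ++r->stalled_ticks >= TX_WATCHDOG_TICKS;
}
```

In this model every transmit attempt pays for the cleanup of earlier packets, so no TX completion interrupt is needed to free descriptors. Locking and the real descriptor format are left out on purpose.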

The RX path uses another technique. We send a wakeup message to the RX
kernel threads and mask RX interrupts. Each RX thread processes the RX
queue until it is empty, and then unmasks the interrupt. This trick lets
us avoid both an RX interrupt storm and the additional latency caused by
admin-configured throttling. The RX interrupt is kept masked if and only
if there are no threads available to handle it. Also, under heavy load
the driver behaves as if it were in polling mode.
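
The mask / wake / drain / unmask cycle described above can be sketched as a small single-threaded model. This is only an illustration under stated assumptions: rx_queue, rx_packet_arrives and rx_thread_step are hypothetical names, there is one RX thread instead of several, and the thread is stepped by hand so the behaviour is deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct rx_queue {
    size_t pending;         /* packets waiting in the (simulated) RX ring */
    bool intr_masked;       /* true while the RX interrupt is masked */
    bool thread_woken;      /* true while the RX thread is running */
    unsigned interrupts;    /* interrupts actually delivered */
    size_t processed;       /* packets handed to the stack */
};

/* Simulated NIC: one packet arrives. Returns true if an interrupt fired. */
static bool rx_packet_arrives(struct rx_queue *q)
{
    q->pending++;
    if (q->intr_masked)
        return false;       /* masked: the running thread will find it */

    /* Interrupt handler: mask the interrupt and wake the RX thread. */
    q->interrupts++;
    q->intr_masked = true;
    q->thread_woken = true;
    return true;
}

/*
 * One step of the RX thread: handle one packet, or, if the queue is
 * empty, unmask the interrupt and go back to sleep.
 * Returns true while the thread stays awake.
 */
static bool rx_thread_step(struct rx_queue *q)
{
    assert(q->thread_woken && q->intr_masked);
    if (q->pending > 0) {
        q->pending--;
        q->processed++;
        return true;
    }
    q->intr_masked = false;
    q->thread_woken = false;
    return false;
}
```

In this model a burst of 100 packets costs one interrupt, because packets that arrive while the thread is still draining are picked up without a new interrupt. That is the sense in which the driver behaves like polling under load. The sketch ignores locking and the ordering between the final emptiness check and the unmask.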

But the major benefit of our patchset is SMP.

> For bge, I tune the interrupt moderation parameters to reduce the tx
> interrupt rate to almost as low as possible without doing polling.
> The rate is either 1 interrupt per second if the tx is almost inactive
> or 1 interrupt every 384 packets if the tx is active.  -current mistunes
> these parameters to 150 (microseconds) and 10 (descriptors).  Old tuning
> of 150 and 128 only loses a little compared with 1000000 and 384.  (150
> gives 6667 interrupts per second under load.  This interrupt rate is
> quite manageable and is about the same rate as you have to use with
> polling to get the same throughput but lower efficiency as with
> interrupts.  128 for the descriptor limit results in a max interrupt rate
> of only a few hundred per second except with tiny packets, but 10 is
> excessively small and requires a rate of up to 140000 per second to keep
> up with tiny packets.  140000 isn't manageable.)
>
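
The arithmetic behind these figures can be checked with a few helper functions. This is a sketch under assumptions that are not stated in the quoted text: a 1 Gbit/s link, 64-byte minimum and 1518-byte maximum Ethernet frames, and 20 bytes of preamble plus inter-frame gap per frame. The function names are hypothetical.

```c
#include <assert.h>

/* Interrupts per second when the NIC fires at most once per `usecs`. */
static double rate_from_timer(double usecs)
{
    assert(usecs > 0.0);
    return 1e6 / usecs;
}

/*
 * Packets per second on Ethernet for a given frame size in bytes.
 * Assumption: each frame costs 20 extra bytes on the wire
 * (8 bytes of preamble + 12 bytes of inter-frame gap).
 */
static double ethernet_pps(double link_bps, double frame_bytes)
{
    assert(frame_bytes > 0.0);
    return link_bps / ((frame_bytes + 20.0) * 8.0);
}

/* Interrupts per second when the NIC fires once per `descs` packets. */
static double rate_from_descs(double pps, double descs)
{
    assert(descs > 0.0);
    return pps / descs;
}
```

With these assumptions, rate_from_timer(150) is about 6667 per second, matching the quoted figure. ethernet_pps(1e9, 64) is about 1.49 million packets per second, so a limit of 10 descriptors gives roughly 149000 interrupts per second, the same order as the quoted 140000, while 128 gives about 11600 and 384 about 3900. For 1518-byte frames, 128 descriptors gives about 635 interrupts per second.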
> em has more/better interrupt parameters with non-broken defaults so I
> haven't needed to tune them.  For bge, I implement dynamic rx interrupt
> moderation in software where em has it in hardware.  10000
> interrupts/second for rx is a good limit.  IIRC, em uses 8000 which is a
> bit low for a max, and is missing a sysctl for easy tuning.
I've spent a lot of time tuning em. That approach has its limits.
:-)
>
> Bruce

Regards,
PS: we have published the newest revision, 1.16. It is just a small tuning fix.

-- 
Vladimir Ivanov
Network Operations Center
OOO "Yandex"
t: +7 495 739-7000
f: +7 495 739-7070
@: noc at yandex.net (corporate)
  wawa at yandex-team.ru (personal)
www: www.yandex.ru
-- 
