Proposed 6.2 em RELEASE patch

Sat Nov 11 06:42:43 UTC 2006

Mike Tancsa wrote:
> At 05:00 PM 11/10/2006, Jack Vogel wrote:
>> On 11/10/06, Mike Tancsa <mike at sentex.net> wrote:
>>>
>>> Some more tests. I tried again with what was committed to today's
>>> RELENG_6. I am guessing it's pretty much the same patch.  Polling is
>>> the only way to avoid livelock at a high pps rate.  Does anyone know
>>> of any simple tools to measure end-to-end packet loss? Polling will
>>> end up dropping some packets and I want to be able to compare.  Same
>>> hardware from the previous post.
>>
>> The commit WAS the last patch I posted. So, to make sure I understand you:
>> are you saying that POLLING is doing better than FAST_INTR, or only
>> better than the legacy code that went in with my merge?
> 
> Hi,
> The last set of tests I posted are ONLY with what is in today's 
> RELENG_6-- i.e. the latest commit. I did a few variations on the 
> driver-- first with
> #define EM_FAST_INTR 1
> in if_em.c
> 
> one without
> 
> and one with polling in the kernel.
> 
> With a decent packet rate passing through, the box will lock up.  Not
> sure if I am just hitting the limits of the PCIe bus, or interrupt
> moderation is not kicking in, or this is a case of "Doctor, it hurts
> when I send a lot of packets through"... "Well, don't do that".
> 
> Using polling prevents the lockup, but it will of course drop packets.
> This is for firewalls with a fairly high bandwidth rate, and I also
> need it to be able to survive a decent DDoS attack.  I am not looking
> for 1Mpps, but something more than 100Kpps.
> 
>         ---Mike

Hi,

Thanks for all of the data.  I know that a good amount of testing was
done with single stream stress tests, but it's not clear how much was
done with multiple streams prior to your efforts.  So, I'm not terribly
surprised by your results.  I'm still a bit unclear on the exact
topology of your setup, so if you could explain it some more in private
email, I'd appreciate it.

For the short term, I don't think that there is anything that can be
magically tweaked that will safely give better results.  I know that
Gleb has some ideas on a fairly simple change for the non-INTR_FAST,
non-POLLING case, but I and several others worry that it's not robust
in the face of real-world network problems.

For the long term, I have a number of ideas for improving both the RX
and TX paths in the driver.  Some are specific to the if_em driver;
others involve improvements in the FFWD and PFIL_HOOKS code as well as the
driver.  What will help me is if you can hook up a serial console to
your machine and see if it can be made to drop to the debugger while it
is under load and otherwise unresponsive.  If you can, getting a process
dump might help confirm where each CPU is spending its time.

Scott
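
On the question quoted above about a simple tool to measure end-to-end
packet loss: one rough way to do it is to stamp every UDP packet with a
sequence number on the sending side and count the gaps on the receiving
side. The sketch below is only an illustration of that bookkeeping, not a
tool anyone in this thread used. All names in it (loss_stats, receive, send,
loopback_demo) are hypothetical, and a Python sender is unlikely to get
anywhere near 100Kpps, so it would be more useful as a low-rate probe stream
running alongside the real load than as the load generator itself.

```python
import socket
import struct
import threading

# Assumption: each test packet starts with a 4-byte big-endian sequence number.
HEADER = struct.Struct("!I")


def loss_stats(sent_count, received_seqs):
    """Return (lost, duplicates, loss_ratio) for one test run."""
    unique = set(received_seqs)
    lost = sent_count - len(unique)
    duplicates = len(received_seqs) - len(unique)
    ratio = lost / sent_count if sent_count else 0.0
    return lost, duplicates, ratio


def receive(sock, seqs, idle_timeout=1.0):
    """Append sequence numbers to seqs until the socket is idle for idle_timeout seconds."""
    sock.settimeout(idle_timeout)
    while True:
        try:
            data, _ = sock.recvfrom(2048)
        except socket.timeout:
            return
        if len(data) >= HEADER.size:
            seqs.append(HEADER.unpack_from(data)[0])


def send(dest, count, payload_len=64):
    """Send count UDP packets to dest, each beginning with its sequence number."""
    pad = b"\0" * max(0, payload_len - HEADER.size)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        for seq in range(count):
            s.sendto(HEADER.pack(seq) + pad, dest)


def loopback_demo(count=1000):
    """Run sender and receiver on 127.0.0.1; a real test would use two hosts."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind(("127.0.0.1", 0))
    seqs = []
    t = threading.Thread(target=receive, args=(rx, seqs))
    t.start()
    send(rx.getsockname(), count)
    t.join()
    rx.close()
    return loss_stats(count, seqs)
```

For a real comparison, receive() would run on a host behind the box under
test and send() on a host in front of it, with the same packet count used
for the polling and non-polling runs so that the loss ratios are comparable.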