Proposed 6.2 em RELEASE patch

Sat Nov 11 12:05:18 UTC 2006

At 01:42 AM 11/11/2006, Scott Long wrote:

>surprised by your results.  I'm still a bit unclear on the exact
>topology of your setup, so if you could explain it some more in private
>email, I'd appreciate it.

Hi,
         I made a quick diagram of the test setup that should make it
clearer:

http://www.tancsa.com/blast.jpg

Basically there are 5 boxes (plus my workstation for out-of-band
access).  The main one being tested is the box marked R2, which has a
2-port PCIe em NIC (Pro 1000PT) in the motherboard's 4X slot.  I have
2 test boxes as UDP senders and 2 test boxes as UDP receivers, and all
the packets flow through the 2 interfaces of R2.  With one stream of
packets being blasted across, the box drops some packets even on its
OOB management interface.  With 2 streams, it's totally unresponsive.
Only with polling am I able to continue to work on the box via the OOB
interface while one or even 2 streams of UDP packets are blasting
across.  However, in polling mode some packets are still being
dropped, and I need to better understand how many.  My goal in all
this is to have a firewall / router that can withstand a high pps
workload and still be reachable OOB when under attack or even under
high workload.

To measure how many packets are dropped, I was looking at making a
modified netreceive that counts the packets it gets, so I can test
whether polling mode will be adequate for my needs.
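
As a rough illustration of the counting idea (this is not the actual
netreceive code, just a minimal Python sketch), a receiver could count
datagrams until the stream goes quiet and then compare that against
the number the sender reports.  The function names, the idle timeout
and the buffer size below are all assumptions made for the example.

```python
import socket
import time


def count_udp_packets(sock, idle_timeout=2.0, bufsize=2048):
    """Count datagrams on a bound UDP socket until it stays idle.

    Returns (packets, bytes, seconds between first and last datagram).
    """
    sock.settimeout(idle_timeout)
    packets = 0
    total_bytes = 0
    first = last = None
    while True:
        try:
            data = sock.recv(bufsize)
        except socket.timeout:
            break  # nothing arrived for idle_timeout seconds: assume the run is over
        now = time.monotonic()
        if first is None:
            first = now
        last = now
        packets += 1
        total_bytes += len(data)
    elapsed = (last - first) if packets else 0.0
    return packets, total_bytes, elapsed


def loss_fraction(sent, received):
    """Fraction of sent packets that never arrived (0.0 if nothing was sent)."""
    if sent <= 0:
        return 0.0
    return max(sent - received, 0) / sent
```

The sender side would need to report how many packets it actually put
on the wire for loss_fraction() to mean anything.  At very high packet
rates a Python receiver may itself drop packets, so for real
measurements the same logic would more likely live in a small C
program.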

Let's say the max pps the box can handle is X, either in polling or
non-polling mode.  As the box approaches X and gets pushed beyond X,
I guess the ideal situation for my needs would be that it drops some
packets on the busiest interface so that it can still function and
service its other needs, be that network, disk, whatever.  But my
question is: is X the same for polling and non-polling modes?
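
One way to put a number on X for each mode would be to sweep the
offered rate, record how many packets arrive at each step, and take
the highest offered rate at which loss stays under some tolerance.
The sketch below assumes that approach.  The function name, the
(offered, received) sample format and the 0.1% default tolerance are
made up for illustration.

```python
def max_sustained_pps(samples, tolerance=0.001):
    """Return the highest offered rate whose loss stays within tolerance.

    samples: iterable of (offered_pps, received_pps) pairs for one mode.
    Returns None if no sample meets the tolerance.
    """
    best = None
    for offered, received in samples:
        if offered <= 0:
            continue  # skip empty or invalid measurements
        loss = max(offered - received, 0) / offered
        if loss <= tolerance and (best is None or offered > best):
            best = offered
    return best
```

Running this once on the polling measurements and once on the
non-polling measurements would give two estimates of X to compare.
It only looks at the forwarded stream, so it says nothing about
whether the box is still reachable OOB at that rate.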

>For the short term, I don't think that there is anything that can be
>magically tweaked that will safely give better results.  I know that
>Gleb has some ideas on a fairly simple change for the non-INTR_FAST,
>non-POLLING case, but I and several others worry that it's not robust
>in the face of real-world network problems.
>
>For the long term, I have a number of ideas for improving both the RX
>and TX paths in the driver.  Some of it is specific to the if_em driver,
>some involve improvements in the FFWD and PFIL_HOOKS code as well as the
>driver.  What will help me is if you can hook up a serial console to
>your machine and see if it can be made to drop to the debugger while it
>is under load and otherwise unresponsive.  If you can, getting a process
>dump might help confirm where each CPU is spending its time.

Yes, I will see what I can do over the weekend. I have some changes 
to babysit again tomorrow night and will see what I can do between cycles.

         ---Mike