Proposed 6.2 em RELEASE patch

Mihail Balikov misho at interbgc.com
Sat Nov 11 10:06:48 UTC 2006


Our routers each have 2 em NICs and forward about 100 kpps.  Without kernel
polling the system becomes unstable; it seems that the default interrupt
moderation on em is 10000 intr/sec.

I have made some modifications to the kernel to decrease packet drops when
polling is enabled:
- modify the clock routines to allow a high clock rate only for polling (5000
polls/sec)
- remove netisr_poll_more
- loop in ether_poll() while there are packets ready
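
To illustrate the last item only, here is a rough user-space sketch of
"keep polling while packets are ready".  It is not the patch itself: the
names fake_nic, fake_poll and poll_tick are hypothetical, and the
max_passes bound is an assumption added here so that one tick cannot spin
forever under sustained load.

```c
#include <assert.h>

/* Hypothetical stand-in for a NIC receive ring; not the em(4) softc. */
struct fake_nic {
    int pending;                /* packets waiting in the RX ring */
};

/*
 * Hypothetical per-device poll handler: process at most `budget`
 * packets and return how many were actually handled.
 */
static int
fake_poll(struct fake_nic *nic, int budget)
{
    int n;

    assert(budget > 0);
    n = nic->pending < budget ? nic->pending : budget;
    nic->pending -= n;
    return (n);
}

/*
 * One polling tick: keep making passes over all devices as long as the
 * previous pass found work, instead of doing a single pass and waiting
 * for the next clock tick.  max_passes (an assumption of this sketch)
 * caps the work done per tick.
 */
static int
poll_tick(struct fake_nic *nics, int nnics, int budget, int max_passes)
{
    int total = 0;
    int pass, i, handled;

    for (pass = 0; pass < max_passes; pass++) {
        handled = 0;
        for (i = 0; i < nnics; i++)
            handled += fake_poll(&nics[i], budget);
        if (handled == 0)
            break;              /* every ring is empty, stop early */
        total += handled;
    }
    return (total);
}
```

With two rings holding 12 and 3 packets and a budget of 5, a single tick
drains both rings in three passes rather than leaving 7 packets for the
next tick.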

Small optimizations in em():
- add CSUM_IP checksum offloading on transmit
- call em_txeof() if there are more than 32 busy packets
- remove E1000_TXD_CMD_RS (Report Status) and check (TDH ==
adapter->oldest_used_tx_desc) instead. This should reduce PCI overhead, but
adds one more PCI read on every em_txeof() call.
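
The last two items can be sketched in user space as follows.  This is a
simplified model, not the driver code: fake_tx_ring, tx_busy, tx_reclaim
and tx_maybe_reclaim are hypothetical names, RING_SIZE is an assumed
value, and the hw_head field stands in for a read of the TDH register.
The sketch also assumes the head pointer only moves past descriptors the
hardware is finished with.

```c
#include <assert.h>

#define RING_SIZE        256    /* assumed ring size, illustration only */
#define CLEAN_THRESHOLD  32     /* "more than 32 busy packets" trigger */

/* Hypothetical model of a transmit descriptor ring. */
struct fake_tx_ring {
    unsigned int oldest_used;   /* first descriptor not yet reclaimed */
    unsigned int next_avail;    /* next descriptor software will fill */
    unsigned int hw_head;       /* stand-in for reading TDH */
};

/* Number of descriptors handed to the hardware and not yet reclaimed. */
static unsigned int
tx_busy(const struct fake_tx_ring *r)
{
    return ((r->next_avail + RING_SIZE - r->oldest_used) % RING_SIZE);
}

/*
 * Reclaim by comparing against the head pointer instead of testing a
 * per-descriptor status bit: one register read per call, then walk
 * from oldest_used up to (but not including) the head.
 */
static unsigned int
tx_reclaim(struct fake_tx_ring *r)
{
    unsigned int head = r->hw_head;     /* the single "PCI read" */
    unsigned int freed = 0;

    assert(head < RING_SIZE);
    while (r->oldest_used != head) {
        /* A real driver would free the transmitted buffer here. */
        r->oldest_used = (r->oldest_used + 1) % RING_SIZE;
        freed++;
    }
    return (freed);
}

/* Only pay for the register read once enough descriptors are busy. */
static unsigned int
tx_maybe_reclaim(struct fake_tx_ring *r)
{
    return (tx_busy(r) > CLEAN_THRESHOLD ? tx_reclaim(r) : 0);
}
```

The trade-off is the one described above: no status write-back per
descriptor, at the cost of one register read each time the reclaim
routine runs.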

OS: FreeBSD 4.11, em() is almost up to date with HEAD.


----- Original Message ----- 
From: "Scott Long" <scottl at samsco.org>
To: "Mike Tancsa" <mike at sentex.net>
Cc: "freebsd-net" <freebsd-net at freebsd.org>; <freebsd-stable at freebsd.org>;
"Jack Vogel" <jfvogel at gmail.com>
Sent: Saturday, November 11, 2006 8:42 AM
Subject: Re: Proposed 6.2 em RELEASE patch


> Mike Tancsa wrote:
> > At 05:00 PM 11/10/2006, Jack Vogel wrote:
> >> On 11/10/06, Mike Tancsa <mike at sentex.net> wrote:
> >>>
> >>> Some more tests. I tried again with what was committed to today's
> >>> RELENG_6. I am guessing it's pretty much the same patch.  Polling is
> >>> the only way to avoid livelock at a high pps rate.  Does anyone know
> >>> of any simple tools to measure end-to-end packet loss? Polling will
> >>> end up dropping some packets and I want to be able to compare.  Same
> >>> hardware from the previous post.
> >>
> >> The commit WAS the last patch I posted. SO, making sure I understood you,
> >> you are saying that POLLING is doing better than FAST_INTR, or only
> >> better than the legacy code that went in with my merge?
> >
> > Hi,
> > The last set of tests I posted are ONLY with what is in today's
> > RELENG_6-- i.e. the latest commit. I did a few variations on the
> > driver-- first with
> > #define EM_FAST_INTR 1
> > in if_em.c
> >
> > one without
> >
> > and one with polling in the kernel.
> >
> > With a decent packet rate passing through, the box will lock up.  Not
> > sure if I am just hitting the limits of the PCIe bus, or interrupt
> > moderation is not kicking in, or this is a case of "Doctor, it hurts
> > when I send a lot of packets through"... "Well, don't do that"
> >
> > Using polling prevents the lockup, but it will of course drop packets.
> > This is for firewalls with a fairly high bandwidth rate, and I also
> > need it to be able to survive a decent DDoS attack.  I am not looking
> > for 1Mpps, but something more than 100Kpps
> >
> >         ---Mike
>
> Hi,
>
> Thanks for all of the data.  I know that a good amount of testing was
> done with single stream stress tests, but it's not clear how much was
> done with multiple streams prior to your efforts.  So, I'm not terribly
> surprised by your results.  I'm still a bit unclear on the exact
> topology of your setup, so if you could explain it some more in private
> email, I'd appreciate it.
>
> For the short term, I don't think that there is anything that can be
> magically tweaked that will safely give better results.  I know that
> Gleb has some ideas on a fairly simple change for the non-INTR_FAST,
> non-POLLING case, but I and several others worry that it's not robust
> in the face of real-world network problems.
>
> For the long term, I have a number of ideas for improving both the RX
> and TX paths in the driver.  Some of it is specific to the if_em driver,
> some involve improvements in the FFWD and PFIL_HOOKS code as well as the
> driver.  What will help me is if you can hook up a serial console to
> your machine and see if it can be made to drop to the debugger while it
> is under load and otherwise unresponsive.  If you can, getting a process
> dump might help confirm where each CPU is spending its time.
>
> Scott
> _______________________________________________
> freebsd-net at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe at freebsd.org"
>


