igb+lagg: poor input performance

Eugene Grosbein egrosbein at rdtc.ru
Mon Dec 20 16:39:40 UTC 2010


Hi!

I am observing some kind of undetected bottleneck in input traffic processing
when traffic comes in through both ports of a dual-port 82576-based card
aggregated into lagg1 (LACP mode). The ports are connected to a Cisco 7606,
and there are over 800 vlans created on top of lagg1.
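
For reference, the aggregation is set up in the usual way (a sketch; the
interface names and vlan number here are illustrative):

# /etc/rc.conf
ifconfig_igb0="up"
ifconfig_igb1="up"
cloned_interfaces="lagg1"
ifconfig_lagg1="laggproto lacp laggport igb0 laggport igb1 up"
# vlans are then created on top of lagg1, e.g.:
# ifconfig vlan100 create vlan 100 vlandev lagg1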

There are no output errors or buffer overflows on the Cisco side,
and the Cisco distributes traffic across the ports just fine.

The vlans carry PPPoE frames; I use mpd5 for PPPoE.
As the number of connected users grows, input traffic and pps grow,
up to 126 Kpps and 560 Mbps on lagg1 for 1500 active PPPoE links.

Beyond that point mrtg draws nearly horizontal lines despite the growing
number of users. There is no congestion on the uplink (lagg0). The system
has a 4-core Xeon E5507 at 2.27 GHz with hyper-threading disabled. All cores
are nearly evenly loaded, but only at 60-65% (almost all of it interrupt
time). The horizontal lines show up in the graphs of each individual core's
load as well as in input traffic and pps for lagg1.
I would expect much more input traffic.
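
The per-core load and the interrupt distribution can be watched with the
standard tools, for example:

top -SHP        # per-CPU load, including interrupt threads
vmstat -i       # per-vector interrupt rates (one line per igb queue)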

I run 8.2-PRERELEASE/amd64 with the latest igb(4) driver, version 2.0.7.
The machine has 4 GB of RAM, over 3 GB of which are free.
sysctl net.inet.ip.intr_queue_drops shows zero.
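
For completeness, mbuf allocator usage versus the nmbclusters/nmbjumbop
limits below can be checked with:

netstat -m      # mbuf/cluster usage and denied-allocation counters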

/boot/loader.conf:

vm.kmem_size=3G
# for igb(4)
hw.igb.rxd=4096
hw.igb.txd=4096
# for lagg(4)
net.link.ifqmaxlen=10240
# for rtsock
net.route.netisr_maxqlen=4096
# for ???
net.isr.defaultqlimit=4096
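
With MSI-X, igb(4) registers a separate interrupt vector per queue, so the
balance between queues can be verified like this (a sketch; the irq numbers
are illustrative):

vmstat -i | grep igb
# irq256: igb0:que 0   ...
# irq257: igb0:que 1   ...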

/etc/sysctl.conf:

# netisr input queue size
net.inet.ip.intr_queue_maxlen=10240

net.inet.ip.fastforwarding=1
net.inet.ip.dummynet.pipe_slot_limit=1000
net.inet.ip.dummynet.io_fast=1

net.isr.direct=0
net.isr.direct_force=0

dev.igb.0.rx_processing_limit=4096
dev.igb.1.rx_processing_limit=4096

kern.ipc.nmbclusters=100000
kern.ipc.nmbjumbop=100000
kern.ipc.maxsockbuf=83886080

net.graph.maxdgram=8388608
net.graph.recvspace=8388608
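
Since direct dispatch is disabled (net.isr.direct=0), all input goes through
the netisr queues; their dispatch policy, per-CPU queue lengths and drop
counters can be inspected (on 8.x) with:

netstat -Q      # netisr statistics: bindings, queued/handled/dropped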

Where should I dig first?

Eugene Grosbein

