netisr+lagg+fragments=80% packet loss
egrosbein at rdtc.ru
Fri Feb 24 17:14:09 UTC 2012
I've found that my PPPoE BRAS server shows 80% packet loss
for fragmented pings addressed to the server itself, while transit fragmented
pings see no packet loss at the same time.
The problem vanishes if I do ONE of the following:
1) disable indirect netisr mode, i.e. set sysctl net.isr.direct=1 AND
net.isr.direct_force=1 back (net.isr.direct=1 alone is not enough to eliminate
the problem), OR
2) bring down one of the two lagg ports, OR
3) decrease the ping packet size so that the packets are no longer fragmented.
More details. This is a FreeBSD 8.2-STABLE/amd64 PPPoE server with mpd-5.5
and 4 NICs: em0 and em1 are combined into lagg0 (LACP mode), which has an IP
address and carries untagged IP packets; igb0 and igb1 are combined into lagg1
(again, LACP mode), which has no IP addresses. Instead, there are about 1000
vlan interfaces built on top of lagg1. All vlan interfaces carry PPPoE traffic only.
I use another FreeBSD 8.2-STABLE system as a PPPoE client (again, mpd-5.5);
it connects to the BRAS just fine and sends/receives transit traffic just fine.
It also pings the BRAS with packets up to 1492 bytes (the MTU of the ng0 interface).
But "ping -s 1472" (1500 bytes at the IP layer) gives 80% drops.
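
The size arithmetic behind these two cases can be sketched in Python. This is
only an illustration, assuming a 20-byte IPv4 header without options and an
8-byte ICMP echo header; the function name fragment_sizes is made up for this
example and is not part of any tool mentioned here.

```python
IP_HEADER = 20    # assumption: IPv4 header without options
ICMP_HEADER = 8   # ICMP echo header


def fragment_sizes(icmp_payload, mtu):
    """Return the IP-layer size of each fragment for one echo packet."""
    remaining = icmp_payload + ICMP_HEADER         # bytes carried after the IP header
    if remaining + IP_HEADER <= mtu:
        return [remaining + IP_HEADER]             # fits into the MTU, no fragmentation
    # Every fragment except the last carries a multiple of 8 bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    sizes = []
    while remaining > per_fragment:
        sizes.append(per_fragment + IP_HEADER)
        remaining -= per_fragment
    sizes.append(remaining + IP_HEADER)
    return sizes


print(fragment_sizes(1464, 1492))  # [1492]      largest ping that still fits ng0
print(fragment_sizes(1472, 1492))  # [1492, 28]  "ping -s 1472" splits into two fragments
```

So with an MTU of 1492, "ping -s 1472" yields a full-size fragment 0 followed by
a tiny 28-byte fragment 1.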
tcpdump -i ng0 shows that EVERY fragmented outgoing ICMP echo-request
gets its echo-reply, but only a small share of the replies arrive in order:
fragment 0 first, fragment 1 next.
Most replies come back out of order: fragment 1 first, fragment 0 second.
These produce no lines in ping's output and do increment the
'fragments dropped after timeout' counter in "netstat -ss" output.
However, tcpdump shows that both fragments arrive together, with no significant delay.
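
To put a number on the reordering, the fragments seen in a capture could be
grouped by IP ID and checked for arrival order. The Python sketch below assumes
the capture has already been reduced to (ip_id, fragment_offset) pairs in
arrival order; the function name count_reordered and the sample list are
hypothetical, not data from the capture described here.

```python
from collections import defaultdict


def count_reordered(fragments):
    """Count datagrams whose fragments arrived in order vs. out of order.

    fragments: iterable of (ip_id, fragment_offset) pairs in arrival order.
    Returns a tuple (in_order, out_of_order).
    """
    offsets = defaultdict(list)
    for ip_id, offset in fragments:
        offsets[ip_id].append(offset)
    # A datagram is in order if its offsets arrived in ascending sequence.
    in_order = sum(1 for seq in offsets.values() if seq == sorted(seq))
    return in_order, len(offsets) - in_order


# Hypothetical arrival sequence: datagram 1 in order, datagram 2 reversed.
sample = [(1, 0), (1, 1472), (2, 1472), (2, 0)]
print(count_reordered(sample))  # (1, 1)
```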
This problem occurs only with net.isr.direct=0/net.isr.direct_force=0,
only when lagg1 has both ports up and running, and only with oversized pings.
At the same time, transit oversized pings go through this BRAS just fine,
no packet loss at all.