6.x, 4.x ipfw/dummynet pf/altq - network performance issues

Justin Robertson justin at sk1llz.net
Wed Feb 7 09:48:07 UTC 2007

It was suggested that I post this to freebsd-performance; it has already 
gone to questions, isp, and net.

I've been running some tests using FreeBSD to filter and rate limit 
traffic. My first thought was to go to the latest stable release, which 
was 6.1 at the time. I've since done the same test under 6.2 and haven't 
seen any difference. I later migrated to running 4.11 to get away from 
these issues, but have discovered others.

I've tested on an AMD 3200+ system with dual Intel 1000 series NICs, an 
AMD Opteron 165 with the same, and a Xeon 2.8 with the same. I've used 
both the stock and the Intel drivers.

On 6.x, normal traffic isn't a problem. The second you get into the realm 
of abusive traffic, such as DoS/DDoS UDP floods (over 100 Mbps), the 
machine falls over. Small packets with IP lengths of 28-29 bytes seem to 
do the most damage. I've tried playing with various sysctl values and 
have seen no difference at all. By "falls over" I mean "stops sending all 
traffic in any direction". TCP SYN floods have the same effect, though 
not quite as rapidly (200-230 Mbps). I then tried moving filtering off to 
a transparent bridge. This improved the situation somewhat, but with an 
extra 30-40 Mbps of UDP data it would still crumble. Overall the machine 
could move between 300k and 600k PPS before becoming crippled, depending 
on packet length, protocol, and any flags. Without a specific pf or ipfw 
rule to deal with a packet the box would fall over; with specific block 
rules it would manage an extra 30-40 Mbps and then fall over.
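
For a sense of scale, the flood rates above can be turned into packet 
rates. This is only a back-of-the-envelope sketch: it assumes the Mbps 
figures are measured on the wire over standard Ethernet (14-byte header, 
4-byte FCS, 64-byte minimum frame, 8 bytes of preamble and 12 bytes of 
inter-frame gap), which the tests above don't state. The helper name 
wire_pps is made up for illustration.

```python
# Rough wire-rate arithmetic for Ethernet. The constants are standard
# Ethernet framing values, not measurements from the tests above.
ETH_HEADER_FCS = 18   # 14-byte header + 4-byte frame check sequence
ETH_MIN_FRAME = 64    # minimum frame size; shorter packets are padded up
ETH_OVERHEAD = 20     # 8-byte preamble/SFD + 12-byte inter-frame gap


def wire_pps(link_mbps, ip_len):
    """Packets per second needed to fill link_mbps with packets whose
    IP total length is ip_len bytes (hypothetical helper)."""
    frame = max(ip_len + ETH_HEADER_FCS, ETH_MIN_FRAME)
    bits_on_wire = (frame + ETH_OVERHEAD) * 8
    return link_mbps * 1_000_000 / bits_on_wire


if __name__ == "__main__":
    for mbps in (100, 1000):
        for ip_len in (28, 1500):
            print(f"{mbps} Mbps, IP length {ip_len}: "
                  f"{wire_pps(mbps, ip_len):,.0f} pps")
```

Under those assumptions a 28-byte IP packet occupies 84 bytes on the 
wire, so 100 Mbps of them is roughly 149k pps and a saturated gigabit 
link is roughly 1.49M pps, against about 81k pps for full-size 1500-byte 
packets.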

On 4.11, again, normal traffic isn't a problem. When routing and 
filtering on the same system, some of the problems found in 6.x are still 
apparent, but to a lesser degree. Splitting the task into a transparent 
filtering bridge with a separate routing box appears to clear them up 
entirely. UDP floods are much better handled - add an ipfw block rule for 
the packet type and the machine responds as if there were no flood at all 
(until total bandwidth saturation or the PPS limits of the hardware, 
which in this case was around 950 Mbps). TCP SYN attacks are also better 
handled; again, a block rule makes it seem as if there were no attack at 
all. The system also appears to be able to move 800-900k PPS of any one 
protocol at a time. However, the second you try to queue abusive traffic 
the machine falls over. Inbound floods appear to cause ALL inbound 
traffic to lag horrifically (while rate limiting/piping), which in turn 
causes a lot of outbound loss due to broken TCP. I'm not sure whether 
this is down to dummynet being horribly inefficient, or whether there's 
some sysctl value to deal with inbound traffic that I'm missing.
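
One part of the queueing lag can be estimated without knowing anything 
about dummynet's CPU cost: the delay added by a pipe queue that a flood 
keeps permanently full. The sketch below is a rough model, not a 
diagnosis. It assumes a single tail-drop FIFO pipe shared by flood and 
legitimate traffic (the pipe layout used in the tests isn't given above), 
a queue of 50 slots (which should be the ipfw(8) default, but is worth 
checking), and a hypothetical 10 Mbit/s pipe. The helper names are 
invented for illustration.

```python
def full_queue_delay_ms(slots, pkt_bytes, pipe_mbps):
    """Time in ms to drain a full queue of `slots` packets of `pkt_bytes`
    each at `pipe_mbps`. A packet admitted behind a full queue waits
    roughly this long before it is sent."""
    bits = slots * pkt_bytes * 8
    return bits / (pipe_mbps * 1000)   # 1 Mbit/s == 1000 bits per ms


def admitted_fraction(pipe_mbps, offered_mbps):
    """Share of the offered load a tail-drop pipe can forward. With one
    shared FIFO, roughly the same share applies to flood and legitimate
    packets alike."""
    return min(1.0, pipe_mbps / offered_mbps)


if __name__ == "__main__":
    # Assumed values: 50-slot queue, 1500-byte packets, 10 Mbit/s pipe,
    # 200 Mbit/s offered (flood plus normal traffic).
    print(f"queueing delay: {full_queue_delay_ms(50, 1500, 10):.1f} ms")
    print(f"admitted share: {admitted_fraction(10, 200):.1%}")
```

In this model, every packet that gets through waits behind a full queue 
and only a small share gets through at all, which would hit TCP in both 
directions. Whether that, rather than per-packet processing cost, 
explains the lag seen in the tests is not something the sketch can show.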

I suppose my concerns are two-fold. Why is 6.x collapsing under traffic 
that 4.11 could easily block and run merrily along with, and is there a 
queueing mechanism in place that doesn't tie up the box so much on 
inbound flows that it ignores all other relevant traffic?

(As a note, all tests were done with device polling enabled. Without it, 
the systems fall over pretty quickly. I also tried tests using 3Com cards 
and had the same results.)

In the event anybody is looking for basic errors, device polling is 
enabled and running at 4000 Hz, which has proved to net the highest 
throughput in PPS. ADAPTIVE_GIANT is on (tests resulted in better PPS 
throughput), all the other monitoring features are disabled, and here are 
my sysctl modifications related to networking (if there's something 
glaring let me know!):

net.inet.udp.recvspace=8192    (I've tried large and small values, thinking 
perhaps I was filling up buffers on UDP floods and then causing it to 
drop TCP; there appears to be no difference)
net.inet.tcp.newreno=0 (I'd try this, but the biggest problem is still 
with UDP, and I'd prefer something compatible with everything for now)
net.link.ether.bridge.enable=0 (or 1 when set up to bridge, obviously)

It has also been pointed out that using net.link.ether.ipfw=1 should 
negate the need for a transparent box; however, the performance disparity 
between 6.x and 4.11 remains.
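
A quick sanity check on the polling settings: polling(4) notes that with 
polling enabled each interface can receive at most HZ * 
kern.polling.burst_max packets per second, unless spare idle-loop cycles 
are available for extra polls. The sketch below just does that 
multiplication. The burst_max value of 150 is assumed to be the 
documented default; the value actually used in the tests isn't listed 
above, so check `sysctl kern.polling.burst_max` on the box. The helper 
name is made up for illustration.

```python
def polling_ceiling_pps(hz, burst_max):
    """Upper bound on packets per second per interface under polling,
    ignoring any extra polls done from the idle loop."""
    return hz * burst_max


if __name__ == "__main__":
    hz = 4000  # polling frequency stated in the post
    # 150 is the assumed default; 300 is a hypothetical raised value.
    for burst_max in (150, 300):
        print(f"HZ={hz}, burst_max={burst_max}: "
              f"{polling_ceiling_pps(hz, burst_max):,} pps")
```

With HZ=4000 and burst_max=150 the ceiling works out to 600,000 pps per 
interface, which is at least in the same range as the 300k-600k PPS 
figure reported for 6.x. Whether that cap is actually the limiting factor 
is not established here.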
