6.x, 4.x ipfw/dummynet pf/altq - network performance issues
garcol at postino.it
Thu Feb 8 10:19:21 UTC 2007
Hi,
I think you can try checking and tuning these sysctl variables and the
ISR-related variables:
net.inet.ip.intr_queue_maxlen
net.inet.ip.intr_queue_drops
net.isr.enable ... try setting this
net.isr.directed
net.isr.queued
net.isr.drop
and polling configuration:
kern.clockrate
kern.polling.burst_max
.... increase for a high rate of small packets on GigE
....
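
For example, roughly along these lines (just a sketch; which of these
sysctls exist and are writable varies between 4.x and 6.x):

  # check current values and the drop counter
  sysctl net.inet.ip.intr_queue_maxlen net.inet.ip.intr_queue_drops
  sysctl net.isr

  # if intr_queue_drops keeps climbing during a flood, try a deeper queue
  sysctl net.inet.ip.intr_queue_maxlen=1024

  # to make it persistent, add the same line to /etc/sysctl.conf:
  # net.inet.ip.intr_queue_maxlen=1024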
Alessandro
> Date: Wed, 07 Feb 2007 01:37:00 -0800
> From: Justin Robertson <justin at sk1llz.net>
> Subject: 6.x, 4.x ipfw/dummynet pf/altq - network performance issues
> To: freebsd-performance at freebsd.org
> Message-ID: <45C99DBC.1050402 at sk1llz.net>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> It was suggested I post this to freebsd-performance; it's already in
> questions, isp, and net.
>
> I've been running some tests using FreeBSD to filter and rate limit
> traffic. My first thought was to go to the latest stable release, which
> was 6.1 at the time. I've since done the same test under 6.2 and haven't
> seen any difference. I later migrated to running 4.11 to get away from
> these issues, but have discovered others.
>
> I've tested on an AMD 3200+ system with dual Intel 1000 series NICs, an
> AMD Opteron 165 with the same, and a Xeon 2.8 with the same. I've used
> both the stock and Intel drivers.
>
> 6.x:
> Normal traffic isn't a problem. The second you get into the realm of
> abusive traffic, such as DoS/DDoS UDP floods (over 100 Mbps), the machine
> falls over. Small packets with IP lengths of 28-29 bytes seem to do the
> most damage. I've tried playing with various sysctl values and have seen
> no difference at all. By "falls over" I mean "stops sending all traffic
> in any direction". TCP SYN packets have the same effect, though not quite
> as rapidly (200-230 Mbps). I then tried moving filtering off to a
> transparent bridge. This improved the situation somewhat, but an extra
> 30-40 Mbps of UDP data and it would ultimately crumble. Overall the
> machine could move between 300k and 600k PPS before being crippled,
> depending on packet length, protocol, and any flags. Without a
> specific pf or ipfw rule to deal with a packet the box would fall over;
> with specific block rules it would manage an extra 30-40 Mbps and then
> fall over.
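
For illustration, block rules of the sort being described might look like
this (em0 and the rule number are placeholders; iplen ranges need ipfw2,
so this may not work with a stock 4.11 ipfw):

  # ipfw: drop the tiny UDP flood packets on the inbound interface
  ipfw add 100 deny udp from any to any in recv em0 iplen 28-29

  # pf: drop inbound UDP on the outside interface
  block drop in quick on em0 proto udp from any to any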
>
> 4.11:
> Again, normal traffic isn't a problem. When routing & filtering on the
> same system some of the problems found in 6.x are still apparent, but to
> a lesser degree. Splitting the task into a transparent filtering bridge
> with a separate routing box appears to clear it up entirely. UDP floods
> are much better handled - an ipfw block rule for the packet type and the
> machine responds as if there were no flood at all (until total bandwidth
> saturation or PPS limits of the hardware, which in this case was around
> 950 Mbps). TCP SYN attacks are also better handled; again, a block rule
> makes it seem as if there were no attack at all. The system also appears
> to be able to move 800-900k PPS of any one protocol at a time. However,
> the second you try to queue abusive traffic, the machine will fall over.
> Inbound floods appear to cause ALL inbound traffic to lag horrifically
> (while rate limiting/piping), which inherently causes a lot of outbound
> loss due to broken TCP. Now, I'm not sure if this is something to do
> with dummynet being horribly inefficient, or if there's some sysctl
> value to deal with inbound that I'm missing.
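
For reference, the kind of dummynet rate limiting being described is
roughly this (bandwidth, queue length, and interface are made-up values):

  # create a pipe limited to 10 Mbit/s with a short queue
  ipfw pipe 1 config bw 10Mbit/s queue 50

  # push inbound UDP through the pipe instead of dropping it outright
  ipfw add 200 pipe 1 udp from any to any in recv em0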
>
> I suppose my concerns are two-fold. Why is 6.x collapsing under traffic
> that 4.11 could easily block and run merrily along with, and is there a
> queueing mechanism in place that doesn't tie up the box so much on
> inbound flows that it ignores all other relevant traffic?
>
> (as a note, all tests were done with device polling enabled. Without it,
> systems fall over pretty quickly. I also tried tests using 3Com cards
> and had the same results)
>
>
> In the event anybody is looking for basic errors: device polling is
> enabled and running at 4000 Hz, which has proved to net the highest
> throughput in PPS. ADAPTIVE_GIANT is on (tests resulted in better PPS
> throughput), all the other monitoring features are disabled, and here are
> my sysctl modifications related to networking (if there's something
> glaring, let me know!):
>
> kern.polling.enable=1
> kern.polling.burst_max=1000
> kern.polling.each_burst=80
> kern.polling.idle_poll=1
> kern.polling.user_frac=20
> kern.polling.reg_frac=50
> net.inet.tcp.recvspace=262144
> net.inet.tcp.sendspace=262144
> kern.ipc.maxsockbuf=1048576
> net.inet.tcp.always_keepalive=1
> net.inet.ip.portrange.first=10000
> kern.ipc.somaxconn=65535
> net.inet.tcp.blackhole=2
> net.inet.udp.blackhole=1
> net.inet.icmp.icmplim=30
> net.inet.ip.forwarding=1
> net.inet.ip.portrange.randomized=0
> net.inet.udp.checksum=0
> net.inet.udp.recvspace=8192 (I've tried large and small values, thinking
> perhaps I was filling up buffers on UDP floods and then causing it to
> drop TCP; there appears to be no difference)
> net.inet.ip.intr_queue_maxlen=512
> net.inet.tcp.delayed_ack=0
> net.inet.tcp.rfc1323=1
> net.inet.tcp.newreno=0 (I'd try this, but the biggest problem is still
> with UDP, and I'd prefer something compatible with everything for now)
> net.inet.tcp.delacktime=10
> net.inet.tcp.msl=2500
> net.inet.ip.rtmaxcache=1024
> net.inet.raw.recvspace=262144
> net.inet.ip.dummynet.hash_size=512
> net.inet.ip.fw.dyn_ack_lifetime=30
> net.inet.ip.fw.dyn_syn_lifetime=10
> net.inet.ip.fw.dyn_fin_lifetime=10
> net.inet.ip.fw.dyn_max=16192
> net.link.ether.bridge.enable=0 (or 1 when set up to bridge, obviously)
> net.inet.ip.fastforwarding=1
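
The polling and ADAPTIVE_GIANT settings mentioned above correspond roughly
to a kernel configuration like this on 6.x (a sketch; 4.11 also uses
DEVICE_POLLING and HZ, but has no ADAPTIVE_GIANT option):

  options DEVICE_POLLING    # enable polling support
  options HZ=4000           # the clock rate reported to give the best PPS
  options ADAPTIVE_GIANT

The sysctl lines above would normally go in /etc/sysctl.conf so they
survive a reboot.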
>
> It has also been pointed out that using net.link.ether.ipfw=1 should
> negate the need for a transparent box; however, the performance disparity
> between 6.x and 4.11 remains.
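
If going that route, the idea would be something like the following
(sketch only; em0 and the rule number are placeholders, and the layer2
option is an ipfw2 feature):

  sysctl net.link.ether.ipfw=1
  # match bridged frames at layer 2 as they come in
  ipfw add 50 deny udp from any to any layer2 in recv em0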
>