6.x, 4.x ipfw/dummynet pf/altq - network performance issues

Justin Robertson justin at sk1llz.net
Thu Feb 15 19:43:48 UTC 2007


  Playing with these sysctl values made no difference at all - what is 
supposed to happen when they're adjusted?

  Another scary discovery - if you've got 6.2 set up to route, even with 
static routes, 1Mbps of TCP SYN traffic will cause it to start dropping 
packets in every direction. Awesome. Methinks I'll be using 4.11 for a 
while. ;P
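
  For anyone trying to reproduce this, the drops are easy to watch with 
nothing more than the standard counters while a flood is running, e.g.:

# per-interface packet/error counters, and a per-second view
netstat -i
netstat -w 1

# IP input queue: how deep it may get, and how much has been dropped so far
sysctl net.inet.ip.intr_queue_maxlen
sysctl net.inet.ip.intr_queue_drops

# protocol-level statistics
netstat -s -p ip
netstat -s -p udp
netstat -s -p tcp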


Justin Robertson wrote:
>
>  Clockrate is based on my device polling setup, which is configured to 
> 4000 Hz. burst_max has a hard limit; it can't go higher than it already 
> is at 1000.
>
>  Could I get an explanation of what the queue and isr sysctl values are 
> actually doing? I'll be able to run some more basic tests tomorrow to 
> see some results, but I want to wrap my head around what's logically 
> supposed to happen when they're adjusted. [I suspect this will do 
> nothing for the UDP issue, but at least I might be able to pipe some 
> TCP traffic]
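>
> (For anyone else poking at the same knobs, sysctl -d should print their 
> short descriptions, which is where I've been starting, e.g.:
>
> # description of the IP input queue knobs, then their current values
> sysctl -d net.inet.ip.intr_queue_maxlen net.inet.ip.intr_queue_drops
> sysctl net.inet.ip.intr_queue_maxlen net.inet.ip.intr_queue_drops
> # the whole netisr subtree
> sysctl -d net.isr
> sysctl net.isr )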
>
>
> garcol at postino.it wrote:
>> Hi,
>>        I think you can try to check/tune these sysctl variables and 
>> the isr-related variables (rough example commands after the list):
>>
>> net.inet.ip.intr_queue_maxlen
>> net.inet.ip.intr_queue_drops
>>
>> net.isr.enable   ... try setting this
>> net.isr.directed
>> net.isr.queued
>> net.isr.drop
>>
>> and polling configuration:
>>
>> kern.clockrate
>>
>> kern.polling.burst_max
>> ....  increase it for a high rate of small packets on GigE ....
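>>
>> Something like the following, as a rough sketch only - the numbers are 
>> just starting points and depend on the NIC and traffic mix:
>>
>> # IP input queue: raise the limit, then watch whether the drop counter grows
>> sysctl net.inet.ip.intr_queue_maxlen=4096
>> sysctl net.inet.ip.intr_queue_drops
>>
>> # netisr: enable direct dispatch; the others are counters to watch
>> sysctl net.isr.enable=1
>> sysctl net.isr.directed net.isr.queued net.isr.drop
>>
>> # polling: check the tick rate, then let each tick take a bigger burst
>> sysctl kern.clockrate
>> sysctl kern.polling.burst_max=1000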
>>
>> Alessandro
>>
>>
>>
>>
>>  
>>> Date: Wed, 07 Feb 2007 01:37:00 -0800
>>> From: Justin Robertson <justin at sk1llz.net>
>>> Subject: 6.x, 4.x ipfw/dummynet pf/altq - network performance issues
>>> To: freebsd-performance at freebsd.org
>>> Message-ID: <45C99DBC.1050402 at sk1llz.net>
>>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>>
>>>   It was suggested I post this to freebsd-performance; it has already 
>>> been sent to -questions, -isp, and -net.
>>>
>>> I've been running some tests using FreeBSD to filter and rate-limit 
>>> traffic. My first thought was to go to the latest stable release, 
>>> which was 6.1 at the time. I've since run the same tests under 6.2 
>>> and haven't seen any difference. I later migrated to running 4.11 to 
>>> get away from these issues, but have discovered others.
>>>
>>> I've tested on an AMD 3200+ system with dual Intel 1000-series NICs, 
>>> an AMD Opteron 165 with the same, and a Xeon 2.8 with the same. I've 
>>> used both the stock and Intel drivers.
>>>
>>> 6.x;
>>> Normal traffic isn't a problem. The second you get into the realm of 
>>> abusive traffic, such as DoS/DDoS UDP floods (over 100Mbps), the 
>>> machine falls over. Small packets with IP lengths of 28-29 bytes 
>>> seem to do the most damage. I've tried playing with various sysctl 
>>> values and have seen no difference at all. By "falls over" I mean 
>>> "stops sending all traffic in any direction". TCP SYN packets have 
>>> the same effect, though not quite as rapidly (200-230Mbps). I then 
>>> tried moving filtering off to a transparent bridge. This improved 
>>> the situation somewhat, but an extra 30-40Mbps of UDP data and it 
>>> would ultimately crumble. Overall the machine could move between 
>>> 300k-600k PPS before becoming crippled, depending on packet length, 
>>> protocol, and any flags. Without a specific pf or ipfw rule to deal 
>>> with a packet the box would fall over; with specific block rules it 
>>> would manage an extra 30-40Mbps and then fall over.
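>>>
>>> (For illustration, block rules of this sort are what I mean - with 
>>> 192.0.2.10 and em0 as placeholders for the actual flood target and 
>>> interface:)
>>>
>>> # ipfw: drop the flood on the way in
>>> ipfw add 100 deny udp from any to 192.0.2.10 in recv em0
>>> ipfw add 110 deny tcp from any to 192.0.2.10 dst-port 80 tcpflags syn in recv em0
>>>
>>> # pf.conf equivalent
>>> block in quick on em0 proto udp from any to 192.0.2.10
>>> block in quick on em0 proto tcp from any to 192.0.2.10 port 80 flags S/SA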
>>>
>>> 4.11;
>>> Again, normal traffic isn't a problem. When routing and filtering on 
>>> the same system, some of the problems found in 6.x are still 
>>> apparent, but to a lesser degree. Splitting the task into a 
>>> transparent filtering bridge with a separate routing box appears to 
>>> clear it up entirely. UDP floods are handled much better - with an 
>>> ipfw block rule for the packet type, the machine responds as if there 
>>> were no flood at all (until total bandwidth saturation or the PPS 
>>> limits of the hardware, which in this case was around 950Mbps). TCP 
>>> SYN attacks are also handled better; again, a block rule makes it 
>>> seem as if there were no attack at all. The system also appears to be 
>>> able to move 800-900k PPS of any one protocol at a time. However, the 
>>> second you try to queue abusive traffic the machine falls over. 
>>> Inbound floods appear to cause ALL inbound traffic to lag 
>>> horrifically (while rate limiting/piping), which in turn causes a 
>>> lot of outbound loss due to broken TCP. Now, I'm not sure whether 
>>> this is dummynet being horribly inefficient, or whether there's some 
>>> sysctl value for inbound traffic that I'm missing.
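>>>
>>> (For illustration, the kind of dummynet queueing I mean is simply 
>>> this, with placeholder addresses and bandwidths:)
>>>
>>> # rate-limit inbound UDP toward the target to 10Mbit/s
>>> ipfw pipe 1 config bw 10Mbit/s queue 50
>>> ipfw add 200 pipe 1 udp from any to 192.0.2.10 in recv em0
>>>
>>> # a separate, larger pipe for the TCP side
>>> ipfw pipe 2 config bw 50Mbit/s queue 50
>>> ipfw add 210 pipe 2 tcp from any to 192.0.2.10 in recv em0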
>>>
>>> I suppose my concerns are twofold. Why is 6.x collapsing under 
>>> traffic that 4.11 could easily block and run merrily along with? And 
>>> is there a queueing mechanism in place that doesn't tie up the box so 
>>> much on inbound flows that it ignores all other relevant traffic?
>>>
>>> (As a note, all tests were done with device polling enabled; without 
>>> it the systems fall over pretty quickly. I also ran tests using 3Com 
>>> cards and had the same results.)
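>>>
>>> (For completeness, polling here means the usual 
>>> kernel-config-plus-sysctl arrangement, roughly:)
>>>
>>> # kernel config
>>> options DEVICE_POLLING
>>> options HZ=4000
>>>
>>> # at runtime
>>> sysctl kern.polling.enable=1
>>> # 6.x also has a per-interface polling flag
>>> ifconfig em0 polling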
>>>
>>>
>>> In the event anybody is looking for basic errors: device polling is 
>>> enabled and running at 4000 Hz, which has proved to give the highest 
>>> throughput in PPS. ADAPTIVE_GIANT is on (tests showed better PPS 
>>> throughput), all the other monitoring features are disabled, and here 
>>> are my sysctl modifications related to networking (if there's 
>>> something glaring, let me know!):
>>>
>>> kern.polling.enable=1
>>> kern.polling.burst_max=1000
>>> kern.polling.each_burst=80
>>> kern.polling.idle_poll=1
>>> kern.polling.user_frac=20
>>> kern.polling.reg_frac=50
>>> net.inet.tcp.recvspace=262144
>>> net.inet.tcp.sendspace=262144
>>> kern.ipc.maxsockbuf=1048576
>>> net.inet.tcp.always_keepalive=1
>>> net.inet.ip.portrange.first=10000
>>> kern.ipc.somaxconn=65535
>>> net.inet.tcp.blackhole=2
>>> net.inet.udp.blackhole=1
>>> net.inet.icmp.icmplim=30
>>> net.inet.ip.forwarding=1
>>> net.inet.ip.portrange.randomized=0
>>> net.inet.udp.checksum=0
>>> net.inet.udp.recvspace=8192    (I've tried large and small values, 
>>> thinking perhaps I was filling up buffers on UDP floods and thereby 
>>> causing it to drop TCP; there appears to be no difference)
>>> net.inet.ip.intr_queue_maxlen=512
>>> net.inet.tcp.delayed_ack=0
>>> net.inet.tcp.rfc1323=1
>>> net.inet.tcp.newreno=0 (I'd try this, but the biggest problem is 
>>> still with UDP, and I'd prefer something compatible with everything 
>>> for now)
>>> net.inet.tcp.delacktime=10
>>> net.inet.tcp.msl=2500
>>> net.inet.ip.rtmaxcache=1024
>>> net.inet.raw.recvspace=262144
>>> net.inet.ip.dummynet.hash_size=512
>>> net.inet.ip.fw.dyn_ack_lifetime=30
>>> net.inet.ip.fw.dyn_syn_lifetime=10
>>> net.inet.ip.fw.dyn_fin_lifetime=10
>>> net.inet.ip.fw.dyn_max=16192
>>> net.link.ether.bridge.enable=0 (or 1 when set up to bridge, 
>>> obviously)
>>> net.inet.ip.fastforwarding=1
>>>
>>> It has also been pointed out that using net.link.ether.ipfw=1 should 
>>> negate the need for a transparent box; however, the performance 
>>> disparity between 6.x and 4.11 remains.
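>>>
>>> (I haven't tested that path under load yet. My understanding - and 
>>> this is just my reading of ipfw(8), not something I've verified - is 
>>> that it amounts to roughly the following, so the same rules get a 
>>> second pass on layer-2 frames:)
>>>
>>> sysctl net.link.ether.ipfw=1
>>> # match the flood on the layer-2 pass as well
>>> ipfw add 300 deny udp from any to 192.0.2.10 layer2 in recv em0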
>>>
>>>     
>
>


-- 
Justin




