increasing em(4) buffer sizes
    rihad 
    rihad at mail.ru
       
    Thu May 20 04:33:53 UTC 2010
    
    
  
On 05/20/2010 12:05 AM, Eugene Grosbein wrote:
> On Wed, May 19, 2010 at 10:51:43PM +0500, rihad wrote:
>
>> We have a FreeBSD 7.2 Intel Server System 4GB RAM box doing traffic
>> shaping and accounting. It has two em gigabit interfaces: one used for
>> input, the other for output, servicing around 500-600 Mbps load through
>> it. Traffic limiting is accomplished by dynamically setting up IPFW
>> pipes, which in turn work fine for our per-user traffic accounting needs
>> thanks to byte counters. So the firewall is basically a longish string
>> of pipe rules. This worked fine when the number of online users was low,
>> but now, as we've slowly begun servicing 2-3K online users, netstat -i's
>> Ierrs column is growing at a rate of 5-15K per hour for em0, the
>> interface used for input. Apparently searching through the firewall
>> linearly for _each_ arriving packet locks the interface for the duration
>> of the search (even though net.isr.direct=0), so some packets are
>> periodically dropped on input. To mitigate the problem I've set up a
>> two-level hash by means of skipto rules, dropping the number of up to
>> several thousand rules to be searched for each packet to a mere 85 max,
>> but the rate of Ierrs has only increased, to 40-50K per hour; I don't
>> know why. I've also tried setting these sysctls:
>
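For readers unfamiliar with the skipto layout described above, here is a minimal sketch of how such a two-level dispatch could be generated. It is not the actual ruleset from this box: the bucket sizes (/24, then /28), the rule numbers, the example addresses and the helper name build_skipto_rules are all assumptions made for illustration, and only the download direction is handled.

```python
import ipaddress
from collections import defaultdict


def build_skipto_rules(user_pipes, first=1000, end=60000):
    """Return ipfw commands for a two-level skipto dispatch (illustrative only).

    user_pipes maps a user IP (string) to an already configured pipe number.
    Level 1 dispatches on the /24, level 2 on the /28, and the leaf blocks
    hold the per-user pipe rules. Every block ends with a jump to `end`, so
    a packet never falls through into a neighbouring block.
    """
    # Group users as /24 -> /28 -> [(ip, pipe)], in address order.
    tree = defaultdict(lambda: defaultdict(list))
    ordered = sorted(user_pipes.items(),
                     key=lambda kv: int(ipaddress.ip_address(kv[0])))
    for ip, pipe in ordered:
        net24 = ipaddress.ip_network(f"{ip}/24", strict=False)
        net28 = ipaddress.ip_network(f"{ip}/28", strict=False)
        tree[net24][net28].append((ip, pipe))

    # Pass 1: decide where each block starts, so skipto targets are known.
    num = first + len(tree) + 1          # level-1 rules plus one "no match" rule
    l2_start, leaf_start = {}, {}
    for net24, subnets in tree.items():
        l2_start[net24] = num
        num += len(subnets) + 1
    for subnets in tree.values():
        for net28, users in subnets.items():
            leaf_start[net28] = num
            num += len(users) + 1

    # Pass 2: emit the rules in numeric order.
    rules = []
    n = first

    def emit(body):
        nonlocal n
        rules.append(f"ipfw add {n} {body}")
        n += 1

    for net24 in tree:
        emit(f"skipto {l2_start[net24]} ip from any to {net24}")
    emit(f"skipto {end} ip from any to any")
    for subnets in tree.values():
        for net28 in subnets:
            emit(f"skipto {leaf_start[net28]} ip from any to {net28}")
        emit(f"skipto {end} ip from any to any")
    for subnets in tree.values():
        for users in subnets.values():
            for ip, pipe in users:
                emit(f"pipe {pipe} ip from any to {ip}")
            emit(f"skipto {end} ip from any to any")
    return rules


# Hypothetical users and pipe numbers, purely for demonstration.
rules = build_skipto_rules({"10.0.0.5": 1, "10.0.0.20": 2, "10.0.1.7": 3})
print("\n".join(rules))
```

With this layout a packet is compared against at most (number of /24s) + (number of /28s in its /24) + (users in its /28) rules, plus the block terminators, instead of against every user rule in turn.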
> First, read: http://www.intel.com/design/network/applnots/ap450.htm
> You'll see you may be restricted by your NIC chip's capabilities.
>
Sooner rather than later these cards will likely be upgraded to 10 GigE
ones; I just want to make sure that the delays imposed by traversing the
firewall never cause traffic drops on input.
> There are loader tunables; set them in /etc/loader.conf:
Do you mean /boot/loader.conf ?
>
> hw.em.rxd=4096
> hw.em.txd=4096
>
BTW, I can't read the current value:
$ sysctl hw.em.rxd
sysctl: unknown oid 'hw.em.rxd'
$
Is this a write-only value? :)
> The price is the amount of kernel memory the driver may consume.
> Maximum MTU=16110 for em(4), so it can consume about 64MB of kernel memory
> for that long input buffer, in theory.
>
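To make the roughly 64 MB estimate above concrete, here is a small back-of-the-envelope calculation. The descriptor count and maximum MTU come from this thread; the assumption that each receive descriptor is backed by one buffer, and the 16 KiB and 2 KiB cluster sizes, are illustrative guesses about how the driver allocates memory, not something stated here.

```python
RX_DESCRIPTORS = 4096              # hw.em.rxd value suggested in this thread
MAX_MTU_BYTES = 16110              # maximum em(4) MTU quoted in this thread
JUMBO_CLUSTER_BYTES = 16 * 1024    # assumption: one 16 KiB cluster per descriptor
STANDARD_CLUSTER_BYTES = 2048      # assumption: one 2 KiB cluster at MTU 1500


def ring_bytes(descriptors, buffer_bytes):
    """Memory needed if every descriptor holds one buffer of the given size."""
    return descriptors * buffer_bytes


worst_case_by_mtu = ring_bytes(RX_DESCRIPTORS, MAX_MTU_BYTES)
worst_case_by_cluster = ring_bytes(RX_DESCRIPTORS, JUMBO_CLUSTER_BYTES)
standard_mtu_case = ring_bytes(RX_DESCRIPTORS, STANDARD_CLUSTER_BYTES)

for label, size in (
    ("max MTU, exact frame size", worst_case_by_mtu),
    ("max MTU, 16 KiB clusters", worst_case_by_cluster),
    ("MTU 1500, 2 KiB clusters", standard_mtu_case),
):
    print(f"{label}: {size} bytes = {size / 2**20:.1f} MiB")
```

Under these assumptions the 64 MB figure is a worst case for jumbo frames; at the standard 1500-byte MTU the same ring would need far less.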
> Some more useful tunables for loader.conf:
>
> dev.em.0.rx_int_delay=200
> dev.em.0.tx_int_delay=200
> dev.em.0.rx_abs_int_delay=200
> dev.em.0.tx_abs_int_delay=200
> dev.em.0.rx_processing_limit=-1
>
So this interrupt delay is the much talked about interrupt moderation? 
Thanks, I'll try them. Is there any risk the machine won't boot with 
them if rebooted remotely?
> Alternatively, you may try kernel polling (ifconfig em0 polling)
> with other tunables:
>
> kern.hz=4000				# for /boot/loader.conf
> kern.polling.burst_max=1000		# for /etc/sysctl.conf
> kern.polling.each_burst=500
>
Wow, I successfully used polling a couple of years ago when the load was
low, but then I read a posting on this list claiming that Intel cards
can do fast interrupts (interrupt moderation), and that for this to work
DEVICE_POLLING needs to be out of the kernel. So I scratched it and
rebuilt the kernel for no apparent reason. Maybe you're right, polling
would've worked just fine, so I may go back to that too.
> Eugene Grosbein
>
>
    
    
More information about the freebsd-net mailing list