High interrupt rate on a PF box + performance

Thu Jan 27 20:38:25 UTC 2011

On 1/27/11 8:57 PM, Jeremy Chadwick wrote:
> On Thu, Jan 27, 2011 at 08:39:40PM +0100, Damien Fleuriot wrote:
>>
>>
>> On 1/27/11 7:46 PM, Sergey Lobanov wrote:
>>> In a message dated Friday 28 January 2011 00:55:35, Damien Fleuriot wrote:
>>>> On 1/27/11 6:41 PM, Vogel, Jack wrote:
>>>>> Jeremy is right, if you have a problem the first step is to try the
>>>>> latest code.
>>>>>
>>>>> However, when I look at the interrupts below I don't see what the problem
>>>>> is? The Broadcom seems to have about the same rate, it just doesn't have
>>>>> MSIX (multiple vectors).
>>>>>
>>>>> Jack
>>>>
>>>> My main concern is that the CPU %interrupt is quite high, also, we seem
>>>> to be experiencing input errors on the interfaces.
>>> Could you show the igb tuning done in loader.conf and the output of
>>> sysctl dev.igb.0?
>>> Did you raise the number of igb descriptors, for example:
>>> hw.igb.rxd=4096
>>> hw.igb.txd=4096 ?
>>
>> There is no tuning at all on our part in loader.conf.
>>
>> Find below the sysctls:
>>
>> # sysctl -a |grep igb
>> dev.igb.0.%desc: Intel(R) PRO/1000 Network Connection version - 1.7.3
>> dev.igb.0.%driver: igb
>> dev.igb.0.%location: slot=0 function=0
>> dev.igb.0.%pnpinfo: vendor=0x8086 device=0x10d6 subvendor=0x8086
>> subdevice=0x145a class=0x020000
>> dev.igb.0.%parent: pci14
>> dev.igb.0.debug: -1
>> dev.igb.0.stats: -1
>> dev.igb.0.flow_control: 3
>> dev.igb.0.enable_aim: 1
>> dev.igb.0.low_latency: 128
>> dev.igb.0.ave_latency: 450
>> dev.igb.0.bulk_latency: 1200
>> dev.igb.0.rx_processing_limit: 100
>> dev.igb.1.%desc: Intel(R) PRO/1000 Network Connection version - 1.7.3
>> dev.igb.1.%driver: igb
>> dev.igb.1.%location: slot=0 function=1
>> dev.igb.1.%pnpinfo: vendor=0x8086 device=0x10d6 subvendor=0x8086
>> subdevice=0x145a class=0x020000
>> dev.igb.1.%parent: pci14
>> dev.igb.1.debug: -1
>> dev.igb.1.stats: -1
>> dev.igb.1.flow_control: 3
>> dev.igb.1.enable_aim: 1
>> dev.igb.1.low_latency: 128
>> dev.igb.1.ave_latency: 450
>> dev.igb.1.bulk_latency: 1200
>> dev.igb.1.rx_processing_limit: 100
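To check that both ports really carry the same settings, output like the above can be compared with a short script. This is only a sketch: the function names are made up for illustration, and the sample text is a subset of the lines pasted above.

```python
# Sketch: group "dev.igb.N.name: value" lines by interface unit and
# report the settings that differ between two units.
# SAMPLE is a subset of the sysctl output pasted in this thread.

SAMPLE = """\
dev.igb.0.%location: slot=0 function=0
dev.igb.0.flow_control: 3
dev.igb.0.enable_aim: 1
dev.igb.0.rx_processing_limit: 100
dev.igb.1.%location: slot=0 function=1
dev.igb.1.flow_control: 3
dev.igb.1.enable_aim: 1
dev.igb.1.rx_processing_limit: 100
"""


def parse_igb_sysctls(text):
    """Return {unit: {name: value}} for lines like 'dev.igb.0.enable_aim: 1'."""
    units = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip wrapped continuation lines and anything that is not an igb OID.
        if not line.startswith("dev.igb.") or ": " not in line:
            continue
        key, value = line.split(": ", 1)
        _, _, unit, name = key.split(".", 3)
        units.setdefault(int(unit), {})[name] = value
    return units


def differing_settings(units, a, b):
    """Return {name: (value_a, value_b)} for settings that differ between units."""
    names = sorted(set(units[a]) | set(units[b]))
    return {
        n: (units[a].get(n), units[b].get(n))
        for n in names
        if units[a].get(n) != units[b].get(n)
    }


if __name__ == "__main__":
    parsed = parse_igb_sysctls(SAMPLE)
    for name, (v0, v1) in differing_settings(parsed, 0, 1).items():
        print(f"{name}: igb0={v0!r} igb1={v1!r}")
```

On the sample above, only the PCI location should differ between the two ports, which matches the untuned defaults reported here.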
> 
> I'm not aware of how to tune igb(4), so the advice Sergey gave you may
> be applicable.  You'll need to schedule downtime to adjust those
> tunables, however (since a reboot will be required).
> 
> I also reviewed the munin graphs.  I don't see anything necessarily
> wrong.  However, you omitted yearly graphs for the network interfaces.

Indeed I have. The reason is that the yearly graphs are fucked up: for
some reason that eludes me, munin recorded a 2-petabyte spike sometime
around September.

So the whole graph is flatlined for the year -.-

However, we clearly have an increase in traffic, as we may also see from
our nginx requests graphs.

> Why I care about that:
> 
> The pf state table (yearly) graph basically correlates with the CPU
> usage (yearly) graph, and I expect that the yearly network graphs would
> show a similar trend: an increase in your overall traffic over the
> course of a year.
> 
> What I'm trying to figure out is what you're concerned about.  You are
> in fact pushing anywhere between 60-120 MBytes/sec across these
> interfaces.  Given those numbers, I'm not surprised by the "high"
> interrupt usage.
> 

I'm worried we may hit a bottleneck soon.
I was also hoping for some kind of magical way to diminish the
interrupts so we could get more performance from the machines.
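As a rough back-of-the-envelope check on why the interrupt load tracks traffic, the sketch below converts the 60-120 MBytes/sec figure quoted above into packets per second. The average packet sizes are assumptions for illustration, not measurements from these machines, and with interrupt moderation (enable_aim is set to 1 here) the real interrupt rate should be lower than the packet rate.

```python
# Rough estimate only: packets per second for a given throughput.
# The average packet sizes used below are assumed, not measured.


def packets_per_second(mbytes_per_sec, avg_packet_bytes):
    """Return the approximate packet rate for a throughput in MBytes/sec."""
    return mbytes_per_sec * 1_000_000 / avg_packet_bytes


if __name__ == "__main__":
    for mbytes in (60, 120):        # range quoted in the thread
        for size in (500, 1500):    # hypothetical average packet sizes
            pps = packets_per_second(mbytes, size)
            print(f"{mbytes} MBytes/sec at {size} bytes/packet: "
                  f"~{pps:,.0f} packets/sec")
```

Even with full-size 1500-byte packets this works out to tens of thousands of packets per second, so a visibly high interrupt share is plausible at this traffic level.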

> Graphs of this nature usually indicate that you're hitting a
> "bottleneck" (for lack of a better word) where you're simply doing "too
> much" with a single machine (given its network throughput).  The machine
> is spending a tremendous amount of CPU time handling network traffic,
> and just as much on pf.
> 

We've indeed been thinking about moving to an active-active setup for
some time already, guess it'll have to happen sooner rather than later :)

> If you want my opinion based on the information I have so far, it's
> this: you need to scale your infrastructure.  You can no longer rely on
> a single machine to handle this amount of traffic.
> 
> As for the network errors you see -- to get low-level NIC and driver
> statistics, you'll need to run "sysctl dev.igb.X.stats=1" then run
> "dmesg" and look at the numbers shown (the sysctl command won't output
> anything itself).  This may help indicate where the packets are being
> lost.  You should also check the interface counters on the switch which
> these interfaces are connected to.  I sure hope it's a managed switch
> which can give you those statistics.
> 
> Hope this helps, or at least acts as food for thought.
> 

Aye, will try that.

We're also considering moving to faster machines but I don't think that
will help much with our problem.

I suppose additional CPU cores will be of no help at all, considering
the kernel is single-threaded and runs on cpu0 only?

Actually, I assume it might even be detrimental to us to add more cores,
since they'll spend more time interrupting each other?

Thanks for sharing your thoughts :)