Server gets a high load, but no CPU use, and then later stops responding on the network

Anton Yuzhaninov citrin+bsd at citrin.ru
Tue Sep 20 20:57:17 UTC 2016


On 2016-09-13 19:23, Ståle Bordal Kristoffersen wrote:
> about once a day, but not in any pattern, it starts getting a load of 5-10
> and usually stops responding over the network before I notice it.

Does it stop responding completely (including ping), or do only some
services and ssh stop responding?

> From googling a bit, I have tried to disable msix on the igb network
> interface, and increased the nmbclusters with no apparent change in behaviour.
> (kern.ipc.nmbclusters="1000000" and hw.igb.enable_msix=0 in loader.conf)

On modern FreeBSD versions kern.ipc.nmbclusters is autotuned to a very
large value, so increasing it manually is rarely needed.

Disabling MSI-X on igb is also unlikely to be needed.

> All I see is that the igb0 taskq pid is almost always in the RUN state when
> the machine is having trouble.

There is no igb0 taskq in top output below.

To see what the top output looks like when the machine stops responding,
it is useful to run top from cron and log its output.

Example script for top logging:
https://bitbucket.org/snippets/citrin/BpeXb
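
The linked snippet is not reproduced here. As a rough illustration of the
same idea, here is a minimal sketch that appends a timestamped snapshot of
a command's output to a log file. The function name log_snapshot, the
--run switch and the log path /var/log/top.log are hypothetical, and the
top flags (-b batch mode, -S system processes, -H threads, -d display
count, -s delay) are the FreeBSD ones; this is not necessarily what the
linked script does.

```shell
#!/bin/sh
# Append a timestamped snapshot of a command's output to a log file.
# Usage: log_snapshot LOGFILE COMMAND [ARGS...]
log_snapshot() {
    logfile=$1
    shift
    {
        # Header line so individual snapshots can be told apart later.
        printf '===== %s =====\n' "$(date -u '+%Y-%m-%d %H:%M:%S UTC')"
        # Run the command, capturing stderr as well.
        "$@" 2>&1
        printf '\n'
    } >> "$logfile"
}

# Take a top snapshot only when called with --run (e.g. from cron).
# Two displays are requested because WCPU in the first one may not be
# meaningful yet; 30 limits the number of lines shown.
if [ "${1:-}" = "--run" ]; then
    log_snapshot /var/log/top.log top -b -S -H -d 2 -s 1 30
fi
```

If this were saved as, say, /usr/local/sbin/toplog.sh (again a
hypothetical path), a crontab entry such as
"* * * * * /usr/local/sbin/toplog.sh --run" would record one snapshot per
minute. The log grows without bound, so it would need rotating.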

In the top output, look at WCPU and STATE for kernel threads and for the
unresponsive network daemons.

Also, do you have a network load graph (bytes and packets per second) for
this host? (I saw munin in the process list.) Maybe the load is simply too
high at the moments when the host is not responding.

Do you use any firewalls or netgraph? What is the primary function of this
server?


More information about the freebsd-questions mailing list