Highly loaded machine getting slower and slower

Lukas Ertl l.ertl at univie.ac.at
Mon Jul 28 06:35:03 PDT 2003

Hi there,

I'm again having problems with a highly loaded 5.1-current machine.  The
box is a 2.4GHz Dual Xeon (HTT enabled) with 1GB RAM and acts as a news
server/feeder running diablo.  It pumps out 120+ Mbit/sec over Gigabit
without a glitch, but after some time it gets slower and slower until it
seems to freeze completely.  It's still alive, just _very_ unresponsive,
and in practice it has to be rebooted.

A kernel without WITNESS checks survives a few hours; a kernel with
WITNESS and friends stays up longer, but after one or two weeks it ends
up in the same state.

Whenever the machine gets stuck and I break into the debugger, I always
get something like this:

db> where
_mtx_lock_sleep(c03fa6f0,0,0,0,ffffffff) at _mtx_lock_sleep+0x1e6
msleep(c21be0ec,c03fa6f0,44,c03ad35b,0) at msleep+0x888
acquire(e3a81a38,1000000,600,11000,c6f23d10) at acquire+0xbe
lockmgr(c21be0ec,2,0,c6f23d10,11000) at lockmgr+0x3f7
_vm_map_lock(c21be0b0,0,0,e3a81a7c,e3a81a84) at _vm_map_lock+0x5d
kmem_alloc_wait(c21be0b0,11000,c6f2b4b0,c1618378,120) at
kern_execve(c6f23d10,bfbff410,bfbff2fc,bfbffd60,0) at kern_execve+0x219
execve(c6f23d10,e3a81d10,c,c022c1e6,3) at execve+0x30
syscall(2f,2f,2f,bfbff490,bfbff2fc) at syscall+0x2b0
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (59, FreeBSD ELF32, execve), eip = 0x481132df, esp =
0xbfbff2ec, ebp = 0      378 ---

The machine is running about 250+ concurrent diablo/dnewslink processes.

Any hints or ideas?


