kern/127024: Problem with unix sockets garbage collector
citrin at citrin.ru
Mon Sep 1 15:10:02 UTC 2008
>Synopsis: Problem with unix sockets garbage collector
>Arrival-Date: Mon Sep 01 15:10:00 UTC 2008
>Originator: Anton Yuzhaninov
>Release: FreeBSD 7.0-STABLE amd64
System: FreeBSD mx22.rambler.ru 7.0-STABLE FreeBSD 7.0-STABLE #1: Fri Jun 27 16:59:59 MSD 2008 root at mx22.rambler.ru:/usr/obj/usr/src/sys/MAIL amd64
>Description:
The problem occurs on SMP boxes when unix sockets are used under high load.
In our case it is a server running the postfix MTA, where unix sockets are used for IPC.
1. Normal operation (after reboot):
thread taskq shows about 0.00% WCPU in top.
sysctl net.local.inflight is almost always zero.
sysctl net.local.taskcount increases rarely.
2. After several days of uptime, thread taskq starts to eat all available CPU:
1684 processes: 26 running, 1639 sleeping, 19 waiting
CPU states:  6.7% user,  0.0% nice, 54.5% system,  1.1% interrupt, 37.7% idle
Mem: 1332M Active, 1903M Inact, 505M Wired, 118M Cache, 214M Buf, 76M Free
Swap: 2060M Total, 2060M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME    WCPU COMMAND
    9 root        1   8    -     0K    16K CPU1   1 536:07 100.00% thread taskq
   12 root        1 171 ki31     0K    16K RUN    0  53.5H  64.06% idle: cpu0
   11 root        1 171 ki31     0K    16K RUN    1  50.3H  14.26% idle: cpu1
sysctl net.local.inflight is always less than 0 (I see values from -1 to -4).
sysctl net.local.taskcount increases at a high rate (about 100 per second).
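The two symptoms above (negative net.local.inflight, fast-growing net.local.taskcount) can be watched for with a small script. This is only a minimal sketch: the sysctl names are the ones from this report, but the function names, the 10 second sampling interval and the 50 per second threshold are arbitrary assumptions chosen for illustration, not values from the kernel or from this PR.

```python
import subprocess
import time


def read_sysctl(name):
    """Return the integer value of a sysctl, read with `sysctl -n <name>`."""
    out = subprocess.run(
        ["sysctl", "-n", name], capture_output=True, text=True, check=True
    )
    return int(out.stdout.strip())


def classify(inflight, taskcount_prev, taskcount_now, interval, rate_limit=50.0):
    """Return (rate, suspicious) for one sampling interval.

    rate is the growth of net.local.taskcount per second.
    rate_limit is an assumed threshold, picked only for this sketch.
    """
    rate = (taskcount_now - taskcount_prev) / interval
    suspicious = inflight < 0 or rate > rate_limit
    return rate, suspicious


def monitor(interval=10.0):
    """Print one status line per interval until interrupted."""
    prev = read_sysctl("net.local.taskcount")
    while True:
        time.sleep(interval)
        now = read_sysctl("net.local.taskcount")
        inflight = read_sysctl("net.local.inflight")
        rate, suspicious = classify(inflight, prev, now, interval)
        flag = "SUSPICIOUS" if suspicious else "ok"
        print(f"inflight={inflight} taskcount_per_sec={rate:.1f} {flag}")
        prev = now
```

Calling monitor() on the affected box prints one line per interval. With the numbers reported here (inflight between -1 and -4, about 100 increments per second) every line would be flagged.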
This looks like a race in the unix sockets code, because we can't reproduce it on a uniprocessor box.
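To show how a race of this general kind could leave a counter below zero, here is a toy replay of two CPUs updating a shared counter without a lock. It is purely illustrative and is not the kernel code: the actual race has not been identified in this report, and the schedule format and the name run are made up for the example.

```python
def run(schedule):
    """Replay load/store steps on a shared counter and return its final value.

    Each step is (cpu, op, delta). "load" copies the counter into that
    CPU's private register; "store" writes register + delta back.
    """
    counter = 0
    regs = {}
    for cpu, op, delta in schedule:
        if op == "load":
            regs[cpu] = counter
        else:
            counter = regs[cpu] + delta
    return counter


# The increment finishes before the decrement starts.
serialized = [
    ("cpu0", "load", 0), ("cpu0", "store", +1),
    ("cpu1", "load", 0), ("cpu1", "store", -1),
]

# Both CPUs load the old value first, so the increment is lost.
interleaved = [
    ("cpu0", "load", 0), ("cpu1", "load", 0),
    ("cpu0", "store", +1), ("cpu1", "store", -1),
]

print(run(serialized), run(interleaved))  # prints "0 -1"
```

In the interleaved case the counter ends at -1 instead of 0, and nothing later corrects it. A counter that stays negative in this way would match the net.local.inflight values seen above, but whether this is what really happens is only a guess.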
>How-To-Repeat:
Run the postfix MTA on a heavily loaded mail server (> 100 connections per second) with 6-STABLE or 7-STABLE (SMP).
The problem should occur after several days (or weeks) of uptime.
>Fix:
Not known yet.
This problem may already be fixed in 8-CURRENT, but we can't run 8-CURRENT on this hardware.