[Bug 203129] syslogd: /dev/console: Interrupted system call
bugzilla-noreply at freebsd.org
Wed Dec 20 17:14:49 UTC 2017
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203129
--- Comment #10 from heas at shrubbery.net ---
I have noticed another regression related to this:
Local system status:
3:01AM up 36 days, 7 hrs, 3 users, load averages: 165.55, 164.19, 161.10
I am forced to reboot at this point.
This is a headless server running 11.1-p4, with a service processor and a
serial console with h/w flow control. A similar headless server without a
serial console does not have this problem, nor does a headless 10.3 server, nor
did this server when it was running 10.3.
By process of elimination, I have found that removing the syslog.conf lines
that log to /dev/console avoids this (or seems to, after 2 days without
recurrence).
What happens is that the daily cron newsyslog rotates files and HUPs syslogd.
At some point after that, usually a few hours later, the load starts to climb.
On this server, many postgres processes drive that load in a tight error/retry
loop. Postgres is configured to log to syslog. That loop traces as:
96701 postgres CALL sendto(0x7,0x7fffffffceb0,0x9d,0,0,0)
96701 postgres RET sendto -1 errno 55 No buffer space available
96701 postgres CALL nanosleep(0x7fffffffc980,0)
96701 postgres RET nanosleep 0
96701 postgres CALL sendto(0x7,0x7fffffffceb0,0x9d,0,0,0)
96701 postgres RET sendto -1 errno 55 No buffer space available
96701 postgres CALL nanosleep(0x7fffffffc980,0)
I failed to take note of what FD 7 is.
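For illustration, the shape of that loop can be sketched as below. This is only a minimal sketch of the pattern visible in the trace (retry the same send whenever it fails with ENOBUFS, sleeping briefly in between); it is not taken from the postgres or libc sources, and the names send_with_retry, flaky_send, delay and max_tries are hypothetical. The max_tries bound is an addition for the sketch; the trace shows no such limit.

```python
import errno
import time


def send_with_retry(send, payload, delay=1e-6, max_tries=None):
    """Call send(payload) until it stops failing with ENOBUFS.

    Returns the number of attempts made. With max_tries=None the loop
    never gives up, which is the behavior the trace suggests.
    """
    tries = 0
    while True:
        tries += 1
        try:
            send(payload)
            return tries
        except OSError as exc:
            if exc.errno != errno.ENOBUFS:
                raise  # any other error is not retried in this sketch
            if max_tries is not None and tries >= max_tries:
                raise  # bounded variant: give up instead of spinning
            time.sleep(delay)  # stands in for the nanosleep() call


if __name__ == "__main__":
    # Hypothetical sender whose buffer frees up after three failures.
    remaining_failures = [3]

    def flaky_send(_payload):
        if remaining_failures[0] > 0:
            remaining_failures[0] -= 1
            raise OSError(errno.ENOBUFS, "No buffer space available")

    print(send_with_retry(flaky_send, b"log line"))
```

If the receiver never drains its socket buffer, the unbounded form of this loop never exits, and each process stuck in it stays runnable and adds to the load average.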
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
96701 pgsql 1 48 0 179M 144M RUN 9 102:21 9.47% postgres
40970 pgsql 1 48 0 179M 144M RUN 8 69:55 9.08% postgres
44637 pgsql 1 48 0 179M 144M RUN 10 67:07 8.98% postgres
70361 pgsql 1 48 0 179M 144M RUN 7 77:29 8.59% postgres
......
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
pgsql 32593 9.1 0.2 182788 147740 - Rs 22:50 95:11.30 postgres: gld gld 127.0.0.1(57118) (postgres)
pgsql 80018 9.0 0.2 182788 147932 - Rs 00:26 74:00.04 postgres: gld gld 127.0.0.1(18337) (postgres)
pgsql 67956 8.9 0.2 182788 147940 - Rs 00:26 73:59.81 postgres: gld gld 127.0.0.1(18309) (postgres)
.....
Active Internet connections (including servers)
Proto Recv-Q Send-Q Local Address Foreign Address (state)
gld to postfix:
tcp4 0 0 127.0.0.1.5432 127.0.0.1.50479 ESTABLISHED
tcp4 0 0 127.0.0.1.50479 127.0.0.1.5432 ESTABLISHED
tcp4 0 0 127.0.0.1.2525 127.0.0.1.50478 CLOSE_WAIT
tcp4 0 0 127.0.0.1.5432 127.0.0.1.29439 ESTABLISHED
tcp4 0 0 127.0.0.1.29439 127.0.0.1.5432 ESTABLISHED
tcp4 0 0 127.0.0.1.2525 127.0.0.1.29438 CLOSE_WAIT
tcp4 0 0 127.0.0.1.5432 127.0.0.1.37203 CLOSED
tcp4 0 0 127.0.0.1.5432 127.0.0.1.31279 CLOSED
tcp4 0 0 127.0.0.1.5432 127.0.0.1.31277 CLOSED
tcp4 0 0 127.0.0.1.5432 127.0.0.1.35576 CLOSED
......
these would be postfix to gld:
tcp4 596 0 127.0.0.1.2525 127.0.0.1.37998 CLOSE_WAIT
tcp4 627 0 127.0.0.1.2525 127.0.0.1.37994 CLOSE_WAIT
tcp4 627 0 127.0.0.1.2525 127.0.0.1.37993 CLOSE_WAIT
tcp4 602 0 127.0.0.1.2525 127.0.0.1.37992 CLOSE_WAIT
tcp4 605 0 127.0.0.1.2525 127.0.0.1.37988 CLOSE_WAIT
tcp4 605 0 127.0.0.1.2525 127.0.0.1.37987 CLOSE_WAIT
tcp4 605 0 127.0.0.1.2525 127.0.0.1.37986 CLOSE_WAIT
......
Active UNIX domain sockets
Address Type Recv-Q Send-Q Inode Conn Refs Nextref Addr
fffff8012e26b690 dgram 0 0 fffff8012e2d3938 0 0 0 /var/named/var/run/log
fffff8012e26b780 dgram 49898 0 fffff8012e2d3b10 0 fffff808375de3c0 0 /var/run/logpriv
fffff8012e26b870 dgram 44120 0 fffff8012e2d3ce8 0 fffff81015ef8870 0 /var/run/log
fffff8012e2b51e0 seqpac 0 0 fffff80077b11938 0 0 0 /var/run/devd.seqpacket.pipe
....
Active Internet connections
Proto Recv-Q Send-Q Local Address Foreign Address R-MBUF S-MBUF R-CLUS S-CLUS R-HIWA S-HIWA R-LOWA S-LOWA R-BCNT S-BCNT R-BMAX S-BMAX rexmt persist keep 2msl delack rcvtime
tcp4 0 0 127.0.0.1.5432 127.0.0.1.38803 0 0 0 0 81720 49032 1 2048 0 0 653760 392256 0.00 0.00 7145.47 0.00 0.00 54.52
tcp4 0 0 127.0.0.1.38803 127.0.0.1.5432 0 0 0 0 81720 49032 1 2048 0 0 653760 392256 0.00 0.00 7145.57 0.00 0.00 54.42
tcp4 668 0 127.0.0.1.2525 127.0.0.1.54066 1 0 1 0 81720 49032 1 2048 2304 0 653760 392256 0.00 0.00 7145.47 0.00 0.00 54.52
tcp4 0 0 127.0.0.1.54066 127.0.0.1.2525 0 0 0 0 81720 49032 1 2048 0 0 653760 392256 0.00 0.00 7145.57 0.00 0.00 54.42
tcp4 668 0 127.0.0.1.2525 127.0.0.1.33204 1 0 1 0 81720 49032 1 2048 2304 0 653760 392256 0.00 0.00 7145.47 0.00 0.00 54.52
tcp4 0 0 127.0.0.1.33204 127.0.0.1.2525 0 0 0 0 81720 49032 1 2048 0 0 653760 392256 0.00 0.00 7145.47 545.47 0.00 54.52
.....
That is as far as I've analyzed this, but I lean toward a vote to revert
whatever changed in the console handling since 10.3.
--
You are receiving this mail because:
You are the assignee for the bug.