[Bug 204426] Processes terminating cannot access memory
bugzilla-noreply at freebsd.org
Sun May 8 14:32:53 UTC 2016
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204426
--- Comment #108 from Robert Blayzor <rblayzor at inoc.net> ---
It's been longer than the average time between failures now and I have not run
into the processes terminating abnormally with the last patch installed against
10.3. HOWEVER, I have noticed a new issue with the network stack that seems to
be happening at roughly the same interval. I'm not sure if the two are related
or if fixing one problem exposed another.
Basically, the servers are now filling up with TCP connections stuck in
a "CLOSED" state. We end up with thousands of them until connections to
the processes just time out.
Sometimes we'll see kernel messages:
sonewconn: pcb 0xfffff80001b36930: Listen queue overflow: 767 already in queue awaiting acceptance (46 occurrences)
sonewconn: pcb 0xfffff80001b36930: Listen queue overflow: 767 already in queue awaiting acceptance (46 occurrences)
sonewconn: pcb 0xfffff80001b36930: Listen queue overflow: 767 already in queue awaiting acceptance (50 occurrences)
sonewconn: pcb 0xfffff80001b36930: Listen queue overflow: 767 already in queue awaiting acceptance (42 occurrences)
sonewconn: pcb 0xfffff80001b36930: Listen queue overflow: 767 already in queue awaiting acceptance (44 occurrences)
sonewconn: pcb 0xfffff80001b36930: Listen queue overflow: 767 already in queue awaiting acceptance (38 occurrences)
sonewconn: pcb 0xfffff80001b36930: Listen queue overflow: 767 already in queue awaiting acceptance (40 occurrences)
[...]
But not always... Currently I have a server in this state and I'll see:
tcp6 0 0 mta1.imap mta-slb-1.alb1.i.20511 CLOSED
tcp6 0 0 mta1.pop3 mta-slb-1.alb1.i.43879 CLOSED
tcp6 0 0 mta1.imap mta-slb-0.alb1.i.5259 CLOSED
tcp6 0 0 mta1.pop3 mta-slb-0.alb1.i.12519 CLOSED
tcp6 0 0 mta1.imap mta-slb-1.alb1.i.1316 CLOSED
tcp6 0 0 mta1.pop3 mta-slb-1.alb1.i.65343 CLOSED
tcp6 0 0 mta1.imap mta-slb-0.alb1.i.16289 CLOSED
tcp6 0 0 mta1.pop3 mta-slb-0.alb1.i.19215 CLOSED
tcp6 32 0 mta1.sieve mta-slb-0.alb1.i.19549 CLOSED
tcp6 0 0 mta1.imap mta-slb-1.alb1.i.49287 CLOSED
tcp6 32 0 mta1.sieve mta-slb-1.alb1.i.54187 CLOSED
tcp6 0 0 mta1.pop3 mta-slb-1.alb1.i.39767 CLOSED
tcp6 0 0 mta1.imap mta-slb-0.alb1.i.54366 CLOSED
tcp6 0 0 mta1.pop3 mta-slb-0.alb1.i.47579 CLOSED
tcp6 0 0 mta1.imap mta-slb-1.alb1.i.48798 CLOSED
tcp6 0 0 mta1.pop3 mta-slb-1.alb1.i.40190 CLOSED
... [ 1000's of lines truncated ]
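To see which services are accumulating these sockets, the netstat output can be tallied by local endpoint. This is only a minimal sketch: the function name count_closed is made up for illustration, and it assumes the six-column layout shown above (proto, Recv-Q, Send-Q, local, foreign, state).

```python
from collections import Counter

def count_closed(netstat_lines):
    """Count TCP sockets in the CLOSED state, keyed by the local address column."""
    counts = Counter()
    for line in netstat_lines:
        fields = line.split()
        # Assumed columns: proto, recv-q, send-q, local, foreign, state
        if len(fields) == 6 and fields[0].startswith("tcp") and fields[5] == "CLOSED":
            counts[fields[3]] += 1
    return counts

# A few lines copied from the output above, used as sample input.
sample = """\
tcp6 0 0 mta1.imap mta-slb-1.alb1.i.20511 CLOSED
tcp6 0 0 mta1.pop3 mta-slb-1.alb1.i.43879 CLOSED
tcp6 32 0 mta1.sieve mta-slb-0.alb1.i.19549 CLOSED
tcp6 0 0 mta1.imap mta-slb-0.alb1.i.5259 CLOSED
""".splitlines()

for local, n in count_closed(sample).most_common():
    print(local, n)
```

Feeding it the full netstat output instead of the sample would give a per-service count of the stuck connections.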
It's not just Dovecot either; we also see several sockets stuck in CLOSED from
Exim. So it doesn't look like an application issue. In fact, sockstat shows
that these stuck sockets are no longer associated with any process, i.e.:
? ? ? ? tcp6 2607:f058:110:2::1:1:143 2607:f058:110:2::f:1:56602
? ? ? ? tcp6 2607:f058:110:2::1:1:4190 2607:f058:110:2::f:0:32558
? ? ? ? tcp6 2607:f058:110:2::1:1:110 2607:f058:110:2::f:1:53931
? ? ? ? tcp6 2607:f058:110:2::1:1:110 2607:f058:110:2::f:0:58671
? ? ? ? tcp6 2607:f058:110:2::1:1:143 2607:f058:110:2::f:1:58788
? ? ? ? tcp6 2607:f058:110:2::1:1:143 2607:f058:110:2::f:0:30523
? ? ? ? tcp6 2607:f058:110:2::1:1:110 2607:f058:10::10:32375
? ? ? ? tcp6 2607:f058:110:2::1:1:143 2607:f058:110:2::f:0:46131
? ? ? ? tcp6 2607:f058:110:2::1:1:110 2607:f058:110:2::f:1:50671
? ? ? ? tcp6 2607:f058:110:2::1:1:143 2607:f058:110:2::f:1:4223
? ? ? ? tcp6 2607:f058:110:2::1:1:143 2607:f058:110:2::f:1:15773
? ? ? ? tcp6 2607:f058:110:2::1:1:110 2607:f058:110:2::f:0:26610
? ? ? ? tcp6 2607:f058:110:2::1:1:4190 2607:f058:110:2::f:0:38765
? ? ? ? tcp6 2607:f058:110:2::1:1:143 2607:f058:110:2::f:0:42310
? ? ? ? tcp6 2607:f058:110:2::1:1:143 2607:f058:110:2::f:1:5643
? ? ? ? tcp6 2607:f058:110:2::1:1:143 2607:f058:110:2::f:0:37143
? ? ? ? tcp6 2607:f058:110:2::1:1:4190 2607:f058:110:2::f:0:24906
[...] (again, thousands of lines truncated)
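The orphaned sockets in the sockstat output can be counted per local port in a similar way. Again a rough sketch only: orphaned_by_port is a hypothetical helper, and it assumes that an orphaned socket is one where sockstat prints "?" for the user, command, PID, and FD columns, as in the lines above. It reads only the first six fields, so it does not matter whether the foreign address sits on the same line or the next one.

```python
from collections import Counter

def orphaned_by_port(sockstat_lines):
    """Count sockets with no owning process, keyed by local port."""
    counts = Counter()
    for line in sockstat_lines:
        fields = line.split()
        # Assumed columns: user, command, pid, fd, proto, local[, foreign]
        if len(fields) >= 6 and fields[:4] == ["?"] * 4:
            # The port is whatever follows the last ':' in the local address.
            port = fields[5].rpartition(":")[2]
            counts[port] += 1
    return counts

# Sample lines taken from the sockstat output above.
sample = """\
? ? ? ? tcp6 2607:f058:110:2::1:1:143 2607:f058:110:2::f:1:56602
? ? ? ? tcp6 2607:f058:110:2::1:1:4190 2607:f058:110:2::f:0:32558
? ? ? ? tcp6 2607:f058:110:2::1:1:110 2607:f058:110:2::f:1:53931
? ? ? ? tcp6 2607:f058:110:2::1:1:143 2607:f058:110:2::f:1:58788
""".splitlines()

for port, n in orphaned_by_port(sample).most_common():
    print(port, n)
```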
netstat -ans -p tcp
tcp:
8497364 packets sent
3825626 data packets (484438498 bytes)
12 data packets (5560 bytes) retransmitted
1 data packet unnecessarily retransmitted
0 resends initiated by MTU discovery
4106401 ack-only packets (0 delayed)
0 URG only packets
0 window probe packets
313890 window update packets
251435 control packets
5525333 packets received
4126773 acks (for 483181992 bytes)
108333 duplicate acks
0 acks for unsent data
3787497 packets (1012422242 bytes) received in-sequence
151 completely duplicate packets (0 bytes)
0 old duplicate packets
0 packets with some dup. data (0 bytes duped)
0 out-of-order packets (0 bytes)
0 packets (0 bytes) of data after window
0 window probes
7067 window update packets
6376 packets received after close
0 discarded for bad checksums
0 discarded for bad header offset fields
0 discarded because packet too short
0 discarded due to memory problems
7546 connection requests
315652 connection accepts
0 bad connection attempts
0 listen queue overflows
73753 ignored RSTs in the windows
323196 connections established (including accepts)
323132 connections closed (including 1414 drops)
121869 connections updated cached RTT on close
121869 connections updated cached RTT variance on close
0 connections updated cached ssthresh on close
2 embryonic connections dropped
4126615 segments updated rtt (of 3726607 attempts)
12 retransmit timeouts
0 connections dropped by rexmit timeout
0 persist timeouts
0 connections dropped by persist timeout
14 Connections (fin_wait_2) dropped because of timeout
19343 keepalive timeouts
19298 keepalive probes sent
45 connections dropped by keepalive
240267 correct ACK header predictions
792401 correct data packet header predictions
315653 syncache entries added
0 retransmitted
0 dupsyn
0 dropped
315652 completed
0 bucket overflow
0 cache overflow
1 reset
0 stale
0 aborted
0 badack
0 unreach
0 zone failures
315653 cookies sent
0 cookies received
164 hostcache entries added
0 bucket overflow
0 SACK recovery episodes
0 segment rexmits in SACK recovery episodes
0 byte rexmits in SACK recovery episodes
0 SACK options (SACK blocks) received
0 SACK options (SACK blocks) sent
0 SACK scoreboard overflow
0 packets with ECN CE bit set
0 packets with ECN ECT(0) bit set
0 packets with ECN ECT(1) bit set
0 successful ECN handshakes
0 times ECN reduced the congestion window
0 packets with valid tcp-md5 signature received
0 packets with invalid tcp-md5 signature received
0 packets with tcp-md5 signature mismatch
0 packets with unexpected tcp-md5 signature received
0 packets without expected tcp-md5 signature received
If I attempt to kill and restart the processes, sometimes it works and
sometimes it doesn't, in which case I end up having to reboot the server.
--
You are receiving this mail because:
You are the assignee for the bug.