read() returns ETIMEDOUT

Mitar mmitar at gmail.com
Wed Feb 4 06:35:28 PST 2009


Hi!

> sysctl net.inet.tcp | grep keep

net.inet.tcp.keepidle: 7200000
net.inet.tcp.keepintvl: 75000
net.inet.tcp.keepinit: 75000
net.inet.tcp.always_keepalive: 1
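
If I read the values correctly these are in milliseconds, so keepidle
7200000 is 2 hours of idle time before the first probe and keepintvl
75000 is 75 seconds between probes; always_keepalive=1 should apply
keepalive to every TCP connection. Just for reference, this is roughly
how a peer would opt in to keepalive explicitly on its own socket (only
a sketch, the function name is made up):

#include <sys/socket.h>
#include <stdio.h>

/* Sketch: explicitly enable keepalive probes on one socket.  With
 * net.inet.tcp.always_keepalive=1 the kernel probes every connection
 * anyway, so this only matters when that sysctl is 0. */
static int enable_keepalive(int fd)
{
    int on = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == -1) {
        perror("setsockopt(SO_KEEPALIVE)");
        return -1;
    }
    return 0;
}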

I am using FreeBSD 7.0-STABLE on the amd64 architecture with a bge
network interface.

The server has an almost constant rx/tx rate of around 5 MB/s
(megabytes). I use the pf firewall and there are a lot of open
connections; for example, some pf state stats:

State Table                          Total             Rate
 current entries                    17042
 searches                      6752096417        14750.6/s
 inserts                         66200602          144.6/s
 removals                        66183560          144.6/s

I have been sending TCP data with netcat listening on the server side
and netcat sending from the client. It was not a very fast connection
(around 50 kB/s (kilobytes) on average) but the sending was steady. The
server has much more bandwidth available. The connection lasted only
around 12 minutes and only about 30 MB of data had been transferred by
the time the server closed the connection.
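
On the server side this shows up as read() failing with ETIMEDOUT
(hence the subject). A minimal sketch of the receive loop I have in
mind, assuming an already connected socket fd (the function name is
just for illustration):

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical receive loop on the listening side: read() returns -1
 * and sets errno to ETIMEDOUT when the kernel has dropped the
 * connection, e.g. after the retransmit timer gave up. */
static ssize_t drain(int fd)
{
    char buf[8192];
    ssize_t n, total = 0;

    while ((n = read(fd, buf, sizeof(buf))) > 0)
        total += n;

    if (n < 0 && errno == ETIMEDOUT)
        fprintf(stderr, "read: %s (connection dropped by TCP timer)\n",
                strerror(errno));
    return total;
}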

The problem is that this is repeatable (I have repeated the test many
times) and under such load it always happens. If I stop all other load
on the server, the connection is not broken by the server.

> (3) TCP retransmit timer reaches its full exponential backoff without being
>    ACK'd.  (tcp_timer_rexmt)

I believe it is because of this. I could not insert a kernel printf as
I am unable to reboot the server at the moment, but I have been
checking the drop counters with netstat, and at the moment the
connection broke the "connections dropped by rexmit timeout" counter
increased. It is true that the counters are increasing almost all the
time under this load, but I believe I timed it correctly.
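
As a rough sanity check on the timing, assuming the stock constants I
remember (TCP_MAXRXTSHIFT = 12 retransmissions, intervals doubling and
clamped to TCPTV_REXMTMAX = 64 s; I have not checked these against the
7.0-STABLE sources) and an RTO of about 3 s when the stall begins, the
retransmit timer would keep trying for roughly 10 minutes before
dropping the connection:

#include <stdio.h>

/* Back-of-the-envelope estimate of how long the retransmit timer keeps
 * trying before the "connections dropped by rexmit timeout" counter
 * increments.  Constants are assumed defaults, not taken from the
 * running kernel. */
int main(void)
{
    const int max_shift = 12;      /* assumed TCP_MAXRXTSHIFT */
    const double rexmt_max = 64.0; /* assumed TCPTV_REXMTMAX, seconds */
    double rto = 3.0;              /* assumed RTO when the stall begins */
    double total = 0.0;

    for (int shift = 0; shift <= max_shift; shift++) {
        double interval = rto * (double)(1 << shift);
        if (interval > rexmt_max)
            interval = rexmt_max;
        total += interval;
    }
    printf("time before rexmit drop: ~%.0f s (~%.1f min)\n",
           total, total / 60.0);
    return 0;
}

The exact figure depends heavily on the RTO at the moment ACKs stop
arriving, so I would not read too much into the number; the point is
only that the drop should come several minutes after the stall starts.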

> It would also be useful, if possible, to look at the tcpdump for the last
> portion of the connection, perhaps ideally from the second-to-last ACK from
> the remote host to the connection reset from the local end.  It might be
> worth running tcpdump on both sides to see if they see the same thing -- for
> example, does one side think it's sending ACKs and the other not receive it?

I have put the complete logs on the net:

http://mitar.tnode.com/Temp/timeout-tcpdump-client.txt.gz
http://mitar.tnode.com/Temp/timeout-tcpdump-server.txt.gz

The client is NATed behind a router on a different ISP.

> In the previous thread, it looked a bit like the outcome was that there was
> a memory exhaustion issue under load, and that bumping nmbclusters helped at
> least defer that problem.  So it would be useful to see the output of
> netstat -m before and after (for as small an epsilon as you can make it) the
> connection is timed out.  I realize capturing the above sorts of data can be
> an issue on high-load boxes but if we can, it would be quite helpful.
> Regardless of that, knowing if you're seeing allocation errors in the
> netstat -m output would be helpful.

I doubt that it is a memory issue, as I have been monitoring those
allocations and they do not come near the max values. The current
netstat -m output is:

10657/8228/18885 mbufs in use (current/cache/total)
8248/7388/15636/25600 mbuf clusters in use (current/cache/total/max)
8248/5994 mbuf+clusters out of packet secondary zone in use (current/cache)
1839/774/2613/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
26857K/19929K/46786K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
27072 requests for I/O initiated by sendfile
0 calls to protocol drain routines
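
If it would help, I can try to sample netstat -m tightly around the
moment the connection is dropped. A shell loop would do, but this is
the kind of quick sampler I would run (just a sketch, one-second
granularity):

#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Sketch: dump `netstat -m` once a second with a timestamp so the
 * snapshots just before and after a drop can be compared. */
int main(void)
{
    for (;;) {
        time_t now = time(NULL);
        printf("=== %s", ctime(&now));  /* ctime() already ends with \n */
        fflush(stdout);

        FILE *p = popen("netstat -m", "r");
        if (p == NULL)
            return 1;
        char line[256];
        while (fgets(line, sizeof(line), p) != NULL)
            fputs(line, stdout);
        pclose(p);

        sleep(1);
    }
}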


Mitar

