read() returns ETIMEDOUT on steady TCP connection

Tim Gebbett tim at gebbettco.com
Thu May 8 18:26:55 UTC 2008


Hi Andre,

Applied the patch, I could not see anything different to the last test. No
packet filtering or NAT are enabled, the test is running over a switch.

Many thanks - Tim

258/6657/6915 mbufs in use (current/cache/total)
256/1084/1340/25600 mbuf clusters in use (current/cache/total/max)
256/1024 mbuf+clusters out of packet secondary zone in use (current/cache)
0/3565/3565/51200 4k (page size) jumbo clusters in use
(current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
576K/18092K/18668K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/5/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines



tcp:
        2535986 packets sent
                2331504 data packets (3266105457 bytes)
                258369 data packets (371873800 bytes) retransmitted
                46685 data packets unnecessarily retransmitted
                0 resends initiated by MTU discovery
                27972 ack-only packets (3905 delayed)
                0 URG only packets
                0 window probe packets
                10636 window update packets
                4 control packets
        1888423 packets received
                1172093 acks (for 3256670917 bytes)
                425512 duplicate acks
                0 acks for unsent data
                46363 packets (52992771 bytes) received in-sequence
                19 completely duplicate packets (17508 bytes)
                0 old duplicate packets
                0 packets with some dup. data (0 bytes duped)
                293 out-of-order packets (376674 bytes)
                0 packets (0 bytes) of data after window
                0 window probes
                242695 window update packets
                0 packets received after close
                0 discarded for bad checksums
                0 discarded for bad header offset fields
                0 discarded because packet too short
                0 discarded due to memory problems
        2 connection requests
        2054 connection accepts
        0 bad connection attempts
        0 listen queue overflows
        597 ignored RSTs in the windows
        2056 connections established (including accepts)
        2052 connections closed (including 2049 drops)
                2048 connections updated cached RTT on close
                2048 connections updated cached RTT variance on close
                2048 connections updated cached ssthresh on close
        0 embryonic connections dropped
        1172093 segments updated rtt (of 1057825 attempts)
        72399 retransmit timeouts
                1 connection dropped by rexmit timeout
        0 persist timeouts
                0 connections dropped by persist timeout
        0 Connections (fin_wait_2) dropped because of timeout
        0 keepalive timeouts
                0 keepalive probes sent
                0 connections dropped by keepalive
        4806 correct ACK header predictions
        43111 correct data packet header predictions
        2054 syncache entries added
                0 retransmitted
                0 dupsyn
                0 dropped
                2054 completed
                0 bucket overflow
                0 cache overflow
                0 reset
                0 stale
                0 aborted
                0 badack
                0 unreach
                0 zone failures
        2054 cookies sent
        0 cookies received
        83801 SACK recovery episodes
        116632 segment rexmits in SACK recovery episodes
        168710294 byte rexmits in SACK recovery episodes
        544885 SACK options (SACK blocks) received
        14 SACK options (SACK blocks) sent
        0 SACK scoreboard overflow
udp:
        31 datagrams received
        0 with incomplete header
        0 with bad data length field
        0 with bad checksum
        0 with no checksum
        0 dropped due to no socket
        10 broadcast/multicast datagrams undelivered
        0 dropped due to full socket buffers
        0 not for hashed pcb
        21 delivered
        22 datagrams output
        0 times multicast source filter matched
sctp:
        0 input packets
                0 datagrams
                0 packets that had data
                0 input SACK chunks
                0 input DATA chunks
                0 duplicate DATA chunks
                0 input HB chunks
                0 HB-ACK chunks
                0 input ECNE chunks
                0 input AUTH chunks
                0 chunks missing AUTH
                0 invalid HMAC ids received
                0 invalid secret ids received
                0 auth failed
                0 fast path receives all one chunk
                0 fast path multi-part data
        0 output packets
                0 output SACKs
                0 output DATA chunks
                0 retransmitted DATA chunks
                0 fast retransmitted DATA chunks
                0 FR's that happened more than once to same chunk.
                0 intput HB chunks
                0 output ECNE chunks
                0 output AUTH chunks
                0 ip_output error counter
        Packet drop statistics:
                0 from middle box
                0 from end host
                0 with data
                0 non-data, non-endhost
                0 non-endhost, bandwidth rep only
                0 not enough for chunk header
                0 not enough data to confirm
                0 where process_chunk_drop said break
                0 failed to find TSN
                0 attempt reverse TSN lookup
                0 e-host confirms zero-rwnd
                0 midbox confirms no space
                0 data did not match TSN
                0 TSN's marked for Fast Retran
        Timeouts:
                0 iterator timers fired
                0 T3 data time outs
                0 window probe (T3) timers fired
                0 INIT timers fired
                0 sack timers fired
                0 shutdown timers fired
                0 heartbeat timers fired
                0 a cookie timeout fired
                0 an endpoint changed its cookiesecret
                0 PMTU timers fired
                0 shutdown ack timers fired
                0 shutdown guard timers fired
                0 stream reset timers fired
                0 early FR timers fired
                0 an asconf timer fired
                0 auto close timer fired
                0 asoc free timers expired
                0 inp free timers expired
        0 packet shorter than header
        0 checksum error
        0 no endpoint for port
        0 bad v-tag
        0 bad SID
        0 no memory
        0 number of multiple FR in a RTT window
        0 RFC813 allowed sending
        0 RFC813 does not allow sending
        0 max burst dosn't allow sending
        0 look ahead tells us no memory in interface
        0 numbers of window probes sent
        0 times an output error to clamp down on next user send.
        0 times sctp_senderrors were caused from a user
        0 number of in data drops due to chunk limit reached
        0 number of in data drops due to rwnd limit reached
        0 times a ECN reduced the cwnd
        0 used express lookup via vtag
        0 collision in express lookup.
        0 times the sender ran dry of user data on primary
        0 same for above
        0 sacks the slow way
        0 window update only sacks sent
        0 sends with sinfo_flags !=0
        0 unordered sends
        0 sends with EOF flag set
        0 sends with ABORT flag set
        0 times protocol drain called
        0 times we did a protocol drain
        0 times recv was called with peek
        0 cached chunks used
        0 cached stream oq's used
        0 unread messages abandonded by close
        0 send burst avoidance, already max burst inflight to net
        0 send cwnd full avoidance, already max burst inflight to net
        0 number of map array over-runs via fwd-tsn's
ip:
        1888457 total packets received
        0 bad header checksums
        0 with size smaller than minimum
        0 with data size < data length
        0 with ip length > max ip packet size
        0 with header length < data size
        0 with data length < header length
        0 with bad options
        0 with incorrect version number
        0 fragments received
        0 fragments dropped (dup or out of space)
        0 fragments dropped after timeout
        0 packets reassembled ok
        1888454 packets for this host
        0 packets for unknown/unsupported protocol
        0 packets forwarded (0 packets fast forwarded)
        3 packets not forwardable
        0 packets received for unknown multicast group
        0 redirects sent
        2628572 packets sent from this host
        0 packets sent with fabricated ip header
        0 output packets dropped due to no bufs, etc.
        0 output packets discarded due to no route
        0 output datagrams fragmented
        0 fragments created
        0 datagrams that can't be fragmented
        0 tunneling packets that can't find gif
        0 datagrams with bad address in header

On Thu, 08 May 2008 18:06:17 +0200, Andre Oppermann <andre at freebsd.org> wrote:
> Hi Tim,
> 
> looking at the ip_output() path there are some places that can
> return ENOBUFS:
> 
>   a) interface queue length check
> 
>   b) packet filter
> 
>   c) destination address rewrite through NAT
> 
>   d) if_output() call
> 
>   e) IP fragmentation if DF was not set
> 
> The first one of those is the most likely to be the source of the
> error.  The output interface queue length in read unlocked and may
> be a stale value on an SMP machine.  Further down in ether_output()
> there are some further possibilities for ENOBUFS errors.  But lets
> concentrate on a) first.
> 
> For testing purposes please apply the following patch to ip_output():
> 
> -------------------------------------------------------------------
> cvs diff -up ip_output.c
> Index: ip_output.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/netinet/ip_output.c,v
> retrieving revision 1.276.2.1
> diff -u -p -r1.276.2.1 ip_output.c
> --- ip_output.c 9 Mar 2008 21:04:54 -0000       1.276.2.1
> +++ ip_output.c 8 May 2008 16:02:32 -0000
> @@ -370,7 +370,7 @@ again:
>                          ip->ip_src = IA_SIN(ia)->sin_addr;
>                  }
>          }
> -
> +#if 0
>          /*
>           * Verify that we have any chance at all of being able to queue
> the
>           * packet or packet fragments, unless ALTQ is enabled on the
> given
> @@ -390,7 +390,7 @@ again:
>                  ifp->if_snd.ifq_drops += (ip->ip_len / ifp->if_mtu + 1);
>                  goto bad;
>          }
> -
> +#endif
>          /*
>           * Look for broadcast address and
>           * verify user is allowed to send
> -------------------------------------------------------------------
> 
> If there is a real interface output queue full event the IFQ_HANDOFF()
> and IFQ_ENQUEUE() macros will report it too.  Then we can focus on the
> interface queues.
> 
> --
> Andre
> 
> Tim Gebbett wrote:
>> Hi all,
>>
>> applied the patch,
>>
>> Well before a ETIMEDOUT error occurred (around 60secs), the tcp debug
> started venting massive quantities of tcp_output error 55 while sending
> with syncache noise:
>>
>>
>> y  8 12:14:26 timtest kernel: :63859 to [192.168.5.40]:80; tcp_output:
> error 55 whilTeC Ps:e n[d1i9n2g. 1(6i8p._5o.u4t3p]u:t64 0371 )t
>> May  8 12:14:26 timtest kernel: o
>> May  8 12:14:26 timtest kernel: [192.168.5.40]:80; tcp_output: error 55
> while sendingT
>> May  8 12:14:26 timtest kernel: C
>> May  8 12:14:26 timtest kernel: P: [192.168.5.43]:63859 to
> [192.168.5.40]:80; tcp_output: error 55 while sending
>> May  8 12:14:26 timtest kernel: TCP: [192.168.5.43]:64037 to
> [192.168.5.40]:80; tcp_output: error 55 while sending (ip_output 1)
>> May  8 12:14:26 timtest kernel: TCP: [192.168.5.43]:64037 to
> [192.168.5.40]:80; tcp_output: error 55 while sending
>> May  8 12:14:26 timtest kernel: TCP: [192.168.5.43]:63857 to
> [192.168.5.40]:80; tcp_output: error 55 while sending (ip_output 1)
>> May  8 12:14:26 timtest kernel: TCP: [192.168.5.42]:56421 toT C[P1:9
> 2[.119628..51.6480.]5:.8403;] :6t3c8p57_ otuot p[u1t9:2 .e1r68r.o5r.
> 40]:8505;  whticlpe_ osuetnpduitn:g  e(rirpo_ro utpu5t5  w1h)i
>>
>> interspersed with clean blocks of 20 entries or so of:
>>
>> May  8 12:14:26 timtest kernel: TCP: [192.168.5.43]:64037 to
> [192.168.5.40]:80; tcp_output: error 55 while sending (ip_output 1)
>> May  8 12:14:26 timtest kernel: TCP: [192.168.5.43]:64037 to
> [192.168.5.40]:80; tcp_output: error 55 while sending
>> May  8 12:14:26 timtest kernel: TCP: [192.168.5.43]:63857 to
> [192.168.5.40]:80; tcp_output: error 55 while sending (ip_output 1)
>>
>>
>> The output did not look appreciably different when the ETIMEDOUT
> occurred.
>>
>> On stopping the client test program:
>>
>> May  8 12:14:46 timtest kernel: TCP: [192.168.5.43]:63978 to
> [192.168.5.40]:80 tcpflags 0x4<RST>; syncache_chkrst: Spurious RST without
> matching syncache entry (possibly syncookie only), segment ignored
>> May  8 12:14:46 timtest kernel: TCP: [192.168.5.43]:63978 to
> [192.168.5.40]:80 tcpflags 0x4<RST>; syncache_chkrst: Spurious RST without
> matching syncache entry (possibly syncookie only), segment ignored
>> May  8 12:14:46 timtest kernel: TCP: [192.168.5.43]:63978 to
> [192.168.5.40]:80 tcpflags 0x4<RST>; syncache_chkrst: Spurious RST without
> matching syncache entry (possibly syncookie only), segment ignored
>>
>> netstat -m
>>
>> 258/11007/11265 mbufs in use (current/cache/total)
>> 256/1596/1852/25600 mbuf clusters in use (current/cache/total/max)
>> 256/1536 mbuf+clusters out of packet secondary zone in use
> (current/cache)
>> 0/7585/7585/51200 4k (page size) jumbo clusters in use
> (current/cache/total/max)
>> 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
>> 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
>> 576K/36283K/36860K bytes allocated to network (current/cache/total)
>> 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
>> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
>> 0/4/6656 sfbufs in use (current/peak/max)
>> 0 requests for sfbufs denied
>> 0 requests for sfbufs delayed
>> 0 requests for I/O initiated by sendfile
>> 0 calls to protocol drain routines
>>
>> Thanks again for your help - Tim
>>
>>
>>
>>
>>
>>
>>
>>
>>
> 
> _______________________________________________
> freebsd-net at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe at freebsd.org"



More information about the freebsd-net mailing list