mlx4en, timer irq @100%... (11.0 stuck on high network load ???)

Slawa Olhovchenkov slw at zxy.spb.ru
Tue Aug 8 11:34:03 UTC 2017


On Tue, Aug 08, 2017 at 10:31:33AM +0200, Hans Petter Selasky wrote:

> Here is the conclusion:
> 
> The following code is going in an infinite loop:
> 
> 
> >         for (;;) {
> >                 TW_RLOCK(V_tw_lock);
> >                 tw = TAILQ_FIRST(&V_twq_2msl);
> >                 if (tw == NULL || (!reuse && (tw->tw_time - ticks) > 0)) {
> >                         TW_RUNLOCK(V_tw_lock);
> >                         break;
> >                 }
> >                 KASSERT(tw->tw_inpcb != NULL, ("%s: tw->tw_inpcb == NULL",
> >                     __func__));
> > 
> >                 inp = tw->tw_inpcb;
> >                 in_pcbref(inp);
> >                 TW_RUNLOCK(V_tw_lock);
> > 
> >                 if (INP_INFO_TRY_RLOCK(&V_tcbinfo)) {
> > 
> >                         INP_WLOCK(inp);
> >                         tw = intotw(inp);
> >                         if (in_pcbrele_wlocked(inp)) {
> 
> in_pcbrele_wlocked() returns (1) because INP_FREED (16) is set in 
> inp->inp_flags2. I guess you have invariants disabled, because the 
> KASSERT() below should have caused a panic.
> 
> >                                 KASSERT(tw == NULL, ("%s: held last inp "
> >                                     "reference but tw not NULL", __func__));
> >                                 INP_INFO_RUNLOCK(&V_tcbinfo);
> >                                 continue;
> >                         }
> 
> This is a regression issue after:
> 
> > commit 5630210a7f1dbbd903b77b2aef939cd47c63da58
> > Author: jch <jch at FreeBSD.org>
> > Date:   Thu Oct 30 08:53:56 2014 +0000
> > 
> >     Fix a race condition in TCP timewait between tcp_tw_2msl_reuse() and
> >     tcp_tw_2msl_scan().  This race condition drives unplanned timewait
> >     timeout cancellation.  Also simplify implementation by holding inpcb
> >     reference and removing tcptw reference counting.
> 
> Suggested fix attached.

Hmm, I am not sure. IMHO, between

TW_RUNLOCK(V_tw_lock);
and
if (INP_INFO_TRY_RLOCK(&V_tcbinfo)) {

`inp` can be invalidated and freed, so this pointer may no longer be valid?
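
To make the window concrete, here is a minimal userspace sketch. It is
not kernel code, and struct obj, obj_alloc(), obj_ref(), obj_rele() and
obj_teardown() are made-up names. It assumes the reference taken by
in_pcbref() before TW_RUNLOCK() behaves like a plain reference count,
and that the release call reports a "freed" flag much as
in_pcbrele_wlocked() is described to do for INP_FREED above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for an inpcb; not the kernel structure. */
struct obj {
	int	refcount;
	bool	freed;		/* models the INP_FREED flag */
};

static int obj_destroyed;	/* counts real free() calls, for checking */

static struct obj *
obj_alloc(void)
{
	struct obj *o = calloc(1, sizeof(*o));

	if (o == NULL)
		abort();
	o->refcount = 1;	/* reference owned by the "pcb list" */
	return (o);
}

/* Models in_pcbref(): taken while the queue lock is still held. */
static void
obj_ref(struct obj *o)
{
	assert(o->refcount > 0);
	o->refcount++;
}

/* Returns 1 if the caller must not use the object any more. */
static int
obj_rele(struct obj *o)
{
	assert(o->refcount > 0);
	if (--o->refcount > 0)
		return (o->freed ? 1 : 0);
	free(o);
	obj_destroyed++;
	return (1);
}

/* Models another thread tearing the connection down inside the window. */
static void
obj_teardown(struct obj *o)
{
	o->freed = true;
	(void)obj_rele(o);	/* drops the list reference only */
}
```

In this model, calling obj_ref(), then obj_teardown(), then obj_rele()
leaves the memory allocated until the last obj_rele(), but the object
is already marked freed by then. Whether the real inpcb gives the same
guarantee in this window is exactly what I am asking.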


> Index: sys/netinet/tcp_timewait.c
> ===================================================================
> --- sys/netinet/tcp_timewait.c	(revision 321981)
> +++ sys/netinet/tcp_timewait.c	(working copy)
> @@ -709,10 +709,11 @@
>  			INP_WLOCK(inp);
>  			tw = intotw(inp);
>  			if (in_pcbrele_wlocked(inp)) {
> -				KASSERT(tw == NULL, ("%s: held last inp "
> -				    "reference but tw not NULL", __func__));
>  				INP_INFO_RUNLOCK(&V_tcbinfo);
> -				continue;
> +				if (tw == NULL)
> +					continue;
> +				else
> +					break;	/* INP_FREED flag is set */
>  			}
>  
>  			if (tw == NULL) {
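
A toy userspace model of how the loop and the patch above can be read.
This is a sketch only: struct entry, after_release() and scan() are
hypothetical names, the locking is left out, and the queue head is
assumed never to change between iterations, as in the reported hang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the first time-wait queue entry; not struct tcptw. */
struct entry {
	bool	freed;		/* models INP_FREED on the owning inpcb */
	bool	has_tw;		/* models intotw(inp) != NULL */
};

enum verdict { SCAN_CONTINUE, SCAN_BREAK, SCAN_PROCESS };

/* Decision taken right after the release call in the loop body. */
static enum verdict
after_release(const struct entry *e, bool patched)
{
	if (!e->freed)
		return (SCAN_PROCESS);	/* release returned 0 */
	if (!patched || !e->has_tw)
		return (SCAN_CONTINUE);	/* old code, or tw == NULL */
	return (SCAN_BREAK);		/* patched: INP_FREED is set */
}

/*
 * Scan a queue whose head stays the same on every pass.  Returns the
 * number of iterations, capped at limit so the model terminates.
 */
static int
scan(const struct entry *head, bool patched, int limit)
{
	int n = 0;

	while (n < limit) {
		n++;
		if (head == NULL)
			break;
		if (after_release(head, patched) != SCAN_CONTINUE)
			break;
		/* continue: the same entry is still first in the queue */
	}
	return (n);
}
```

With an entry that has freed and has_tw both set, the unpatched path
runs until the iteration cap, while the patched path leaves the loop on
the first pass.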



