11.0 stuck on high network load

Slawa Olhovchenkov slw at zxy.spb.ru
Mon Oct 10 14:29:44 UTC 2016


On Mon, Oct 10, 2016 at 04:03:39PM +0200, Julien Charbon wrote:

> 
>  Hi Slawa,
> 
> On 10/10/16 3:32 PM, Slawa Olhovchenkov wrote:
> > On Mon, Oct 10, 2016 at 01:26:12PM +0200, Julien Charbon wrote:
> >> On 10/6/16 1:10 PM, Slawa Olhovchenkov wrote:
> >>> On Thu, Oct 06, 2016 at 09:28:06AM +0200, Julien Charbon wrote:
> >>>
> >>>> 2. thread1:  In tcp_close() the inp is marked with INP_DROPPED flag, the
> >>>> process continues and calls INP_WUNLOCK() here:
> >>>>
> >>>> https://github.com/freebsd/freebsd/blob/releng/11.0/sys/netinet/tcp_subr.c#L1568
> >>>
> >>> Look also to sys/netinet/tcp_timewait.c:488
> >>>
> >>> And check other locks from r160549
> >>
> >>  You are right, and here the a fix proposal for this issue:
> >>
> >> Fix a double-free when an inp transitions to INP_TIMEWAIT state after
> >> having been dropped
> >> https://reviews.freebsd.org/D8211
> >>
> >>  It basically enforces in_pcbdrop() logic in tcp_input():  A INP_DROPPED
> >> inpcb should never be proceed further.
> >>
> >>  Slawa, as you are the only one to reproduce this issue currently, could
> >> test this patch?  (And remove the temporary patch I did provided to you
> >> before).
> >>
> >>  I will wait for your tests results before pushing further.
> >>
> >>  Thanks!
> >>
> >> diff --git a/sys/netinet/tcp_input.c b/sys/netinet/tcp_input.c
> >> index c72f01f..37f27e0 100644
> >> --- a/sys/netinet/tcp_input.c
> >> +++ b/sys/netinet/tcp_input.c
> >> @@ -921,6 +921,16 @@ findpcb:
> >>                 goto dropwithreset;
> >>         }
> >>         INP_WLOCK_ASSERT(inp);
> >> +       /*
> >> +        * While waiting for inp lock during the lookup, another thread
> >> +        * can have droppedt  the inpcb, in which case we need to loop back
> >> +        * and try to find a new inpcb to deliver to.
> >> +        */
> >> +       if (inp->inp_flags & INP_DROPPED) {
> >> +               INP_WUNLOCK(inp);
> >> +               inp = NULL;
> >> +               goto findpcb;
> > 
> > Are you sure about this goto?
> > Can this cause infinite loop by found same inpcb?
> > May be drop packet is more correct?
> 
>  Good question:  Infinite loop is not possible here, as the next TCP
> hash lookup will return NULL or a fresh new and not dropped inp.  You

I am not expert in this api and don't see cause of this: I am assume
hash lookup don't remove from hash returned args and I am don't see
any removing of this inp. Why hash lookup don't return same inp?

(assume this input patch interrupt callout code on the same CPU core).

> can check the current other usages of goto findpcb in tcp_input().  The
> rational here being:
> 
>  - Behavior before the patch:  If the inp we found was deleted then goto
> findpcb.
>  - Behavior after the patch:  If the inp we found was deleted or dropped
> then goto findpcb.
> 
>  I just prefer having the same behavior applied everywhere:  If
> tcp_input() loses the inp lock race and the inp was deleted or dropped
> then retry to find a new inpcb to deliver to.
> 
>  But you are right dropping the packet here will also fix the issue.
> 
>  Then the review process becomes quite helpful because people can argue:
>  Dropping here is better because "blah", or goto findpcb is better
> because "bluh", etc.  And at the review end you have a nice final patch.
> 
> https://reviews.freebsd.org/D8211

I am not sure, I am see to

sys/netinet/in_pcb.h:#define    INP_DROPPED             0x04000000 /* protocol drop flag */

and think this is a flag 'all packets must be droped'



More information about the freebsd-stable mailing list