ixgbe(4) spin lock held too long

John Baldwin jhb at freebsd.org
Fri Oct 24 18:47:01 UTC 2014


On Thursday, October 23, 2014 02:12:44 PM Jason Wolfe wrote:
> On Sat, Oct 18, 2014 at 4:42 AM, John Baldwin <jhb at freebsd.org> wrote:
> > On Friday, October 17, 2014 11:32:13 PM Jason Wolfe wrote:
> >> Producing 10G of random traffic against a server with this assertion
> >> added took about 2 hours to panic, so if it turns out we need anything
> >> further it should be pretty quick.
> >> 
> >> #4 list
> >> 2816                     * timer and remember to restart (more output or persist).
> >> 2817                     * If there is more data to be acked, restart retransmit
> >> 2818                     * timer, using current (possibly backed-off) value.
> >> 2819                     */
> >> 2820                    if (th->th_ack == tp->snd_max) {
> >> 2821                            tcp_timer_activate(tp, TT_REXMT, 0);
> >> 2822                            needoutput = 1;
> >> 2823                    } else if (!tcp_timer_active(tp, TT_PERSIST))
> >> 2824                            tcp_timer_activate(tp, TT_REXMT, tp->t_rxtcur);
> >
> > Bah, this is just a bug in my assertion.  Rather than having a separate
> > tcp_timer_deactivate() routine, a delta of 0 passed to tcp_timer_activate()
> > means "stop the timer".  My assertions were incorrect and need to exclude
> > the stop case.  Here is an updated patch (or you can just fix yours
> > locally):
> > 
> > Index: tcp_timer.c
> > ===================================================================
> > --- tcp_timer.c (revision 273219)
> > +++ tcp_timer.c (working copy)
> > @@ -869,10 +869,16 @@ tcp_timer_activate(struct tcpcb *tp, int timer_typ
> > 
> >                 case TT_REXMT:
> >                         t_callout = &tp->t_timers->tt_rexmt;
> >                         f_callout = tcp_timer_rexmt;
> > 
> > +                       if (callout_active(&tp->t_timers->tt_persist) &&
> > +                           delta != 0)
> > +                               panic("scheduling retransmit with persist active");
> >                         break;
> >                 
> >                 case TT_PERSIST:
> >                         t_callout = &tp->t_timers->tt_persist;
> >                         f_callout = tcp_timer_persist;
> > 
> > +                       if (callout_active(&tp->t_timers->tt_rexmt) &&
> > +                           delta != 0)
> > +                               panic("scheduling persist with retransmit active");
> >                         break;
> >                 
> >                 case TT_KEEP:
> >                         t_callout = &tp->t_timers->tt_keep;
> > 
> > --
> > John Baldwin
> 
> John,
> 
> panic: tcp_setpersist: retransmit pending
> 
> (kgdb) bt
> #0  doadump (textdump=1) at pcpu.h:219
> #1  0xffffffff806facb1 in kern_reboot (howto=260) at
> /usr/src/sys/kern/kern_shutdown.c:452
> #2  0xffffffff806fb014 in panic (fmt=<value optimized out>) at
> /usr/src/sys/kern/kern_shutdown.c:759
> #3  0xffffffff808467d3 in tcp_setpersist (tp=<value optimized out>) at
> /usr/src/sys/netinet/tcp_output.c:1619
> #4  0xffffffff8084e7b6 in tcp_timer_persist (xtp=0xfffff804ec124c00)
> at /usr/src/sys/netinet/tcp_timer.c:467
> #5  0xffffffff8070d95e in softclock_call_cc (c=0xfffff804ec124ec0,
> cc=0xffffffff81263380, direct=0)
>     at /usr/src/sys/kern/kern_timeout.c:687
> #6  0xffffffff8070dce4 in softclock (arg=<value optimized out>) at
> /usr/src/sys/kern/kern_timeout.c:816
> #7  0xffffffff806d16f3 in intr_event_execute_handlers (p=<value
> optimized out>, ie=0xfffff80015214400)
>     at /usr/src/sys/kern/kern_intr.c:1263
> #8  0xffffffff806d2056 in ithread_loop (arg=0xfffff800151f7ee0) at
> /usr/src/sys/kern/kern_intr.c:1276
> #9  0xffffffff806cf481 in fork_exit (callout=0xffffffff806d1fc0
> <ithread_loop>, arg=0xfffff800151f7ee0,
>     frame=0xfffffe1f9e9b0ac0) at /usr/src/sys/kern/kern_fork.c:996
> #10 0xffffffff80a67c0e in fork_trampoline () at
> /usr/src/sys/amd64/amd64/exception.S:606
> 
> (kgdb) frame 3
> #3  0xffffffff808467d3 in tcp_setpersist (tp=<value optimized out>) at
> /usr/src/sys/netinet/tcp_output.c:1619
> 1619                 panic("tcp_setpersist: retransmit pending");
> (kgdb) list
> 1614            int t = ((tp->t_srtt >> 2) + tp->t_rttvar) >> 1;
> 1615            int tt;
> 1616
> 1617            tp->t_flags &= ~TF_PREVVALID;
> 1618            if (tcp_timer_active(tp, TT_REXMT))
> 1619                 panic("tcp_setpersist: retransmit pending");
> 1620            /*
> 1621             * Start/restart persistance timer.
> 1622             */
> 1623            TCPT_RANGESET(tt, t * tcp_backoff[tp->t_rxtshift],
> 
> (kgdb) up
> #4  0xffffffff8084e7b6 in tcp_timer_persist (xtp=0xfffff804ec124c00)
> at /usr/src/sys/netinet/tcp_timer.c:467
> 467             tcp_setpersist(tp);
> (kgdb) list
> 462                 (ticks - tp->t_rcvtime) >= TCPTV_PERSMAX) {
> 463                  TCPSTAT_INC(tcps_persistdrop);
> 464                  tp = tcp_drop(tp, ETIMEDOUT);
> 465                  goto out;
> 466             }
> 467             tcp_setpersist(tp);
> 468             tp->t_flags |= TF_FORCEDATA;
> 469             (void) tcp_output(tp);
> 470             tp->t_flags &= ~TF_FORCEDATA;
> 
> Jason

Weird, this is the same as before.  It should have panic'd when it scheduled 
either one of the timers before this.  Can you get a stack trace from the 
other threads?  Perhaps the timers are being scheduled concurrently?

Can you also 'set print pretty' and 'p *tp'?
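
For reference, the invariant the patched assertions are meant to enforce can
be sketched as a small userspace model.  This is only an illustration under
stated assumptions: the names model_timers and model_timer_activate are
hypothetical and are not kernel APIs, and a return value of -1 stands in for
the place where the patched tcp_timer_activate() would panic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum model_timer { MODEL_REXMT, MODEL_PERSIST, MODEL_NTIMERS };

struct model_timers {
	bool active[MODEL_NTIMERS];	/* stands in for callout_active() */
};

/*
 * Model of the assertion logic in the updated patch.
 * Returns 0 on success and -1 where the patched kernel would panic.
 */
static int
model_timer_activate(struct model_timers *t, enum model_timer which, int delta)
{
	enum model_timer other =
	    (which == MODEL_REXMT) ? MODEL_PERSIST : MODEL_REXMT;

	if (delta == 0) {
		/* A delta of 0 means "stop the timer"; always permitted. */
		t->active[which] = false;
		return (0);
	}
	if (t->active[other]) {
		/* Scheduling one timer while the other is still active. */
		return (-1);
	}
	t->active[which] = true;
	return (0);
}
```

In this model, stopping the retransmit timer while the persist timer is
active succeeds, which is the case the first version of the assertions got
wrong.  Scheduling either timer with a non-zero delta while the other is
active is reported as a failure.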

-- 
John Baldwin

