svn commit: r336313 - in head/sys: dev/bnxt dev/e1000 dev/ixgbe dev/ixl net sys
Marius Strobl
marius at freebsd.org
Sun Jul 22 18:08:23 UTC 2018
On Wed, Jul 18, 2018 at 10:33:13PM +0200, Alexander Leidinger wrote:
> Quoting Marius Strobl <marius at freebsd.org> (from Sun, 15 Jul 2018
> 19:04:23 +0000 (UTC)):
>
> > Author: marius
> > Date: Sun Jul 15 19:04:23 2018
> > New Revision: 336313
> > URL: https://svnweb.freebsd.org/changeset/base/336313
> >
> > Log:
> > Assorted TSO fixes for em(4)/iflib(9) and dead code removal:
> [...]
> > Okayed by: sbruno@ at 201806 DevSummit Transport Working Group [1]
> > Reviewed by: sbruno (earlier version), erj
> > PR: 219428 (part of; comment #10) [1], 220997 (part of; comment #3)
>
> Hi Marius,
>
> thanks a lot for this change, it improves the situation (PR 220997) a
> lot. The system is running r336329, so I don't have your change
> r336356 on it yet. Maybe the 2 panics I've seen (more below) are fixed
> by it. Before I try your second change (surely not before the
> weekend), here is at least the report, in case it is related to your
> changes and not related to r336313:
>
> I got 2 panics, both within 6 minutes (based upon the timestamp of the
> coredumps in the filesystem):
>
> 1)
> panic: Assertion ifsd_m[next] == NULL failed at /usr/src/sys/net/iflib.c:3151
> cpuid = 2
> time = 1531944124
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe008af85850
> vpanic() at vpanic+0x1a3/frame 0xfffffe008af858b0
> doadump() at doadump/frame 0xfffffe008af85930
> iflib_txq_drain() at iflib_txq_drain+0xe58/frame 0xfffffe008af85aa0
> ifmp_ring_check_drainage() at ifmp_ring_check_drainage+0x16c/frame 0xfffffe008af85b00
> _task_fn_tx() at _task_fn_tx+0x76/frame 0xfffffe008af85b30
> gtaskqueue_run_locked() at gtaskqueue_run_locked+0x139/frame 0xfffffe008af85b80
> gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0x88/frame 0xfffffe008af85bb0
> fork_exit() at fork_exit+0x84/frame 0xfffffe008af85bf0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe008af85bf0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> Uptime: 1d22h51m17s
> Dumping 2990 out of 8037 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>
> __curthread () at ./machine/pcpu.h:230
> 230 __asm("movq %%gs:%1,%0" : "=r" (td)
> (kgdb) #0 __curthread () at ./machine/pcpu.h:230
> #1 doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:366
> #2 0xffffffff80485ea1 in kern_reboot (howto=260)
> at /usr/src/sys/kern/kern_shutdown.c:446
> #3 0xffffffff80486483 in vpanic (fmt=<optimized out>, ap=0xfffffe008af858f0)
> at /usr/src/sys/kern/kern_shutdown.c:863
> #4 0xffffffff804861f0 in kassert_panic (
> fmt=0xffffffff807e085f "Assertion %s failed at %s:%d")
> at /usr/src/sys/kern/kern_shutdown.c:749
> #5 0xffffffff8059cd78 in iflib_busdma_load_mbuf_sg (flags=0,
> txq=<optimized out>, tag=<optimized out>, map=<optimized out>,
> m0=<optimized out>, segs=<optimized out>, nsegs=<optimized out>,
> max_segs=<optimized out>) at /usr/src/sys/net/iflib.c:3151
> #6 iflib_encap (txq=0xfffff800028dc000, m_headp=0xfffffe00959bdd30)
> at /usr/src/sys/net/iflib.c:3321
> #7 iflib_txq_drain (r=0xfffffe00959ba000, cidx=<optimized out>,
> pidx=41319936) at /usr/src/sys/net/iflib.c:3636
> #8 0xffffffff805a0f4c in drain_ring_lockless (r=<optimized out>, os=...,
> prev=<optimized out>, budget=<optimized out>)
> at /usr/src/sys/net/mp_ring.c:199
> #9 ifmp_ring_check_drainage (r=<optimized out>, budget=32)
> at /usr/src/sys/net/mp_ring.c:502
> #10 0xffffffff80599c46 in _task_fn_tx (context=<optimized out>)
> at /usr/src/sys/net/iflib.c:3747
> #11 0xffffffff804cd2c9 in gtaskqueue_run_locked (queue=0xfffff800025e0d00)
> at /usr/src/sys/kern/subr_gtaskqueue.c:332
> #12 0xffffffff804cd048 in gtaskqueue_thread_loop (arg=<optimized out>)
> at /usr/src/sys/kern/subr_gtaskqueue.c:507
> #13 0xffffffff8044cc34 in fork_exit (
> callout=0xffffffff804ccfc0 <gtaskqueue_thread_loop>,
> arg=0xfffffe0007ffd038, frame=0xfffffe008af85c00)
> at /usr/src/sys/kern/kern_fork.c:1057
> (kgdb) up 5
> #5 0xffffffff8059cd78 in iflib_busdma_load_mbuf_sg (flags=0,
> txq=<optimized out>, tag=<optimized out>,
> map=<optimized out>, m0=<optimized out>, segs=<optimized out>,
> nsegs=<optimized out>, max_segs=<optimized out>)
> at /usr/src/sys/net/iflib.c:3151
> 3151 MPASS(ifsd_m[next] == NULL);
> (kgdb) list
> 3146 /*
> 3147 * see if we can't be smarter about physically
> 3148 * contiguous mappings
> 3149 */
> 3150 next = (pidx + count) & (ntxd-1);
> 3151 MPASS(ifsd_m[next] == NULL);
> 3152 #if MEMORY_LOGGING
> 3153 txq->ift_enqueued++;
> 3154 #endif
> 3155 ifsd_m[next] = m;
> (kgdb) print ifsd_m
> $1 = (struct mbuf **) 0xfffffe00959b8000
> (kgdb) print next
> $2 = <optimized out>
> (kgdb) print pidx
> $3 = 277
> (kgdb) print count
> $4 = 0
> (kgdb) print ntxd
> $5 = <optimized out>
>
>
> 2)
> Unread portion of the kernel message buffer:
> panic: Assertion ifsd_m[next] == NULL failed at /usr/src/sys/net/iflib.c:3151
> cpuid = 2
> time = 1531944550
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe008af85850
> vpanic() at vpanic+0x1a3/frame 0xfffffe008af858b0
> doadump() at doadump/frame 0xfffffe008af85930
> iflib_txq_drain() at iflib_txq_drain+0xe58/frame 0xfffffe008af85aa0
> ifmp_ring_check_drainage() at ifmp_ring_check_drainage+0x16c/frame 0xfffffe008af85b00
> _task_fn_tx() at _task_fn_tx+0x76/frame 0xfffffe008af85b30
> gtaskqueue_run_locked() at gtaskqueue_run_locked+0x139/frame 0xfffffe008af85b80
> gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0x88/frame 0xfffffe008af85bb0
> fork_exit() at fork_exit+0x84/frame 0xfffffe008af85bf0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe008af85bf0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> Uptime: 5m27s
> Dumping 1555 out of 8037 MB:..2%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>
> __curthread () at ./machine/pcpu.h:230
> 230 __asm("movq %%gs:%1,%0" : "=r" (td)
> (kgdb) bt
> #0 __curthread () at ./machine/pcpu.h:230
> #1 doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:366
> #2 0xffffffff80485ea1 in kern_reboot (howto=260) at
> /usr/src/sys/kern/kern_shutdown.c:446
> #3 0xffffffff80486483 in vpanic (fmt=<optimized out>, ap=0xfffffe008af858f0)
> at /usr/src/sys/kern/kern_shutdown.c:863
> #4 0xffffffff804861f0 in kassert_panic (fmt=0xffffffff807e085f
> "Assertion %s failed at %s:%d")
> at /usr/src/sys/kern/kern_shutdown.c:749
> #5 0xffffffff8059cd78 in iflib_busdma_load_mbuf_sg (flags=0,
> txq=<optimized out>, tag=<optimized out>,
> map=<optimized out>, m0=<optimized out>, segs=<optimized out>,
> nsegs=<optimized out>, max_segs=<optimized out>)
> at /usr/src/sys/net/iflib.c:3151
> #6 iflib_encap (txq=0xfffff800028fe000, m_headp=0xfffffe00959bdde8)
> at /usr/src/sys/net/iflib.c:3321
> #7 iflib_txq_drain (r=0xfffffe00959ba000, cidx=<optimized out>,
> pidx=42948608) at /usr/src/sys/net/iflib.c:3636
> #8 0xffffffff805a0f4c in drain_ring_lockless (r=<optimized out>,
> os=..., prev=<optimized out>,
> budget=<optimized out>) at /usr/src/sys/net/mp_ring.c:199
> #9 ifmp_ring_check_drainage (r=<optimized out>, budget=32) at
> /usr/src/sys/net/mp_ring.c:502
> #10 0xffffffff80599c46 in _task_fn_tx (context=<optimized out>) at
> /usr/src/sys/net/iflib.c:3747
> #11 0xffffffff804cd2c9 in gtaskqueue_run_locked (queue=0xfffff800025a2200)
> at /usr/src/sys/kern/subr_gtaskqueue.c:332
> #12 0xffffffff804cd048 in gtaskqueue_thread_loop (arg=<optimized out>)
> at /usr/src/sys/kern/subr_gtaskqueue.c:507
> #13 0xffffffff8044cc34 in fork_exit (callout=0xffffffff804ccfc0
> <gtaskqueue_thread_loop>, arg=0xfffffe0007ffd038,
> frame=0xfffffe008af85c00) at /usr/src/sys/kern/kern_fork.c:1057
> #14 <signal handler called>
> (kgdb) up 5
> #5 0xffffffff8059cd78 in iflib_busdma_load_mbuf_sg (flags=0,
> txq=<optimized out>, tag=<optimized out>,
> map=<optimized out>, m0=<optimized out>, segs=<optimized out>,
> nsegs=<optimized out>, max_segs=<optimized out>)
> at /usr/src/sys/net/iflib.c:3151
> 3151 MPASS(ifsd_m[next] == NULL);
> (kgdb) print ifsd_m
> $1 = (struct mbuf **) 0xfffffe00959b8000
> (kgdb) print pidx
> $2 = 707
> (kgdb) print count
> $3 = 0
Hrm, so far I neither see how iflib(9) could get into that state nor
have I succeeded in reproducing the panic, not even with a LEM-class
MAC. Is that an old or a new problem? If the latter, please try with
r336612. The fix in r336356 is only relevant for IGB-class devices, so
it doesn't apply to your machine unless the above panics are from gear
different from what PR 220997 is about.
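
For reference, the check that fires can be modelled in userland. The
following is only a sketch under stated assumptions, not iflib(9) code:
struct ring, ring_next(), ring_store(), ring_release() and the ring size
NTXD = 1024 are made up for illustration (ntxd is optimized out in both
dumps). Only the index arithmetic is taken from the listing at
iflib.c:3150. With count = 0, as printed in both dumps, the computed
index equals pidx whenever the power-of-two ring size exceeds pidx, so
the assertion would mean that slot pidx itself still held an mbuf
pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ring size; must be a power of two for the mask to work. */
#define NTXD 1024u

/* Stand-in for the per-descriptor mbuf pointer array (ifsd_m in the dump). */
struct ring {
	void *slots[NTXD];
};

/* Same index arithmetic as iflib.c:3150: wrap (pidx + count) with a mask. */
static inline uint32_t
ring_next(uint32_t pidx, uint32_t count, uint32_t ntxd)
{
	assert((ntxd & (ntxd - 1)) == 0);	/* power of two only */
	return ((pidx + count) & (ntxd - 1));
}

/*
 * Store m in the slot for (pidx + count).  Returns 0 on success, or -1 if
 * the slot is still occupied, which is the condition MPASS() panics on.
 */
static int
ring_store(struct ring *r, uint32_t pidx, uint32_t count, void *m)
{
	uint32_t next = ring_next(pidx, count, NTXD);

	if (r->slots[next] != NULL)
		return (-1);
	r->slots[next] = m;
	return (0);
}

/* Clear a slot, as this model assumes must happen before it is reused. */
static void
ring_release(struct ring *r, uint32_t idx)
{
	r->slots[idx & (NTXD - 1)] = NULL;
}
```

In this model, storing twice at the same pidx without an intervening
ring_release() is the only way to get the -1 result, which is why a
stale entry in ifsd_m is the interesting thing to look for in the dumps.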
Marius