[Bug 219399] System panics after several hours of 14-threads-compilation orgies using poudriere on AMD Ryzen...

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Fri Jul 28 11:19:56 UTC 2017


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219399

--- Comment #182 from Nils Beyer <nbe at renzel.net> ---
In order to track these compilation errors, I did what AMD support requested:
cleared CMOS by removing all cables and the battery, and set VCORE statically
to 1.36250V.

Then I started a new, fresh poudriere run.

And guess what, after 1733 built ports (1 failed - "ghc"), my system panicked:
------------------------------------------------------------------------------
root at asbach:/var/crash/#kgdb -c vmcore.0
/usr/lib/debug/boot/kernel/kernel.debug 
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
spin lock 0xffffffff81dc8b50 (smp rendezvous) held by 0xfffff801325ea560 (tid
102081) too long
timeout stopping cpus
panic: spin lock held too long
cpuid = 6
KDB: stack backtrace:
#0 0xffffffff80aada97 at kdb_backtrace+0x67
#1 0xffffffff80a6bb76 at vpanic+0x186
#2 0xffffffff80a6b9e3 at panic+0x43
#3 0xffffffff80a4cf71 at _mtx_lock_spin_cookie+0x311
#4 0xffffffff81042dc1 at smp_targeted_tlb_shootdown+0x101
#5 0xffffffff81042cac at smp_masked_invltlb+0x4c
#6 0xffffffff80eced91 at pmap_invalidate_all+0x211
#7 0xffffffff80ed936a at pmap_advise+0x49a
#8 0xffffffff80d60c26 at vm_map_madvise+0x2c6
#9 0xffffffff80d6534e at sys_madvise+0x7e
#10 0xffffffff80ee0394 at amd64_syscall+0x6c4
#11 0xffffffff80ec392b at Xfast_syscall+0xfb
Uptime: 4h4m31s
Dumping 5426 out of 32665 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

Reading symbols from /usr/lib/debug/boot/kernel/zfs.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/zfs.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/opensolaris.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/opensolaris.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/linprocfs.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/linprocfs.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/linux_common.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/linux_common.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/tmpfs.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/tmpfs.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/vmm.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/vmm.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/ums.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/ums.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/pflog.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/pflog.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/pf.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/pf.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/linux.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/linux.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/linux64.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/linux64.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/nullfs.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/nullfs.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/fdescfs.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/fdescfs.ko.debug
#0  doadump (textdump=<value optimized out>) at pcpu.h:222
222     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) bt
#0  doadump (textdump=<value optimized out>) at pcpu.h:222
#1  0xffffffff80a6b6f1 in kern_reboot (howto=260) at
/usr/src/sys/kern/kern_shutdown.c:366
#2  0xffffffff80a6bbb0 in vpanic (fmt=<value optimized out>, ap=<value
optimized out>) at /usr/src/sys/kern/kern_shutdown.c:759
#3  0xffffffff80a6b9e3 in panic (fmt=<value optimized out>) at
/usr/src/sys/kern/kern_shutdown.c:690
#4  0xffffffff80a4cf71 in _mtx_lock_spin_cookie (c=<value optimized out>,
v=<value optimized out>, tid=18446735289348100096, opts=<value optimized out>, 
    file=<value optimized out>, line=<value optimized out>) at
/usr/src/sys/kern/kern_mutex.c:672
#5  0xffffffff81042dc1 in smp_targeted_tlb_shootdown (mask={__bits =
0xfffffe085f03b780}, vector=244, pmap=<value optimized out>, addr1=<value
optimized out>, addr2=0)
    at /usr/src/sys/x86/x86/mp_x86.c:1470
#6  0xffffffff81042cac in smp_masked_invltlb (mask={__bits =
0xfffffe085f03b7b0}, pmap=<value optimized out>) at
/usr/src/sys/x86/x86/mp_x86.c:1504
#7  0xffffffff80eced91 in pmap_invalidate_all (pmap=0xfffff8017f9ff138) at
/usr/src/sys/amd64/amd64/pmap.c:1662
#8  0xffffffff80ed936a in pmap_advise (pmap=<value optimized out>,
sva=35436597248, eva=35436597248, advice=5) at
/usr/src/sys/amd64/amd64/pmap.c:6189
#9  0xffffffff80d60c26 in vm_map_madvise (map=<value optimized out>,
start=35436552192, end=35436597248, behav=<value optimized out>) at
/usr/src/sys/vm/vm_map.c:2291
#10 0xffffffff80d6534e in sys_madvise (td=<value optimized out>, uap=<value
optimized out>) at /usr/src/sys/vm/vm_mmap.c:705
#11 0xffffffff80ee0394 in amd64_syscall (td=0xfffff802bb419000, traced=0) at
subr_syscall.c:135
#12 0xffffffff80ec392b in Xfast_syscall () at
/usr/src/sys/amd64/amd64/exception.S:396
#13 0x00000008020502fa in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language:  auto; currently minimal
------------------------------------------------------------------------------
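
kgdb prints several arguments in the backtrace in decimal, which makes them hard
to compare with the hex addresses in the message buffer. A minimal Python sketch
for converting them follows; the values are copied from the output above, the
helper name as_hex is made up for illustration, and the 4 KiB page size is an
assumption about the base page size on amd64.

```python
# Values copied from the kgdb output above; kgdb printed some of them in decimal.
PAGE_SIZE = 4096  # assumption: 4 KiB base pages on amd64


def as_hex(value):
    """Format an integer as a 64-bit, zero-padded hex address."""
    return "0x%016x" % value


waiter_tid = 18446735289348100096      # tid= argument in frame #4
syscall_td = 0xfffff802bb419000        # td= argument in frame #11
holder_td = 0xfffff801325ea560         # "held by" address in the message buffer
start, end = 35436552192, 35436597248  # madvise range in frame #9

print("waiter tid   :", as_hex(waiter_tid))
print("syscall td   :", as_hex(syscall_td))
print("lock holder  :", as_hex(holder_td))
print("same thread  :", waiter_tid == syscall_td)
print("madvise range:", as_hex(start), "-", as_hex(end))
print("range pages  :", (end - start) // PAGE_SIZE)
```

Read this way, the tid in frame #4 is the same address as td in frame #11
(0xfffff802bb419000), so the panicking thread is the one that was waiting for
the lock, while the thread named as the holder (0xfffff801325ea560, tid 102081)
is a different one. The madvise range spans 11 pages, and advice=5 in frame #8
should correspond to MADV_FREE in FreeBSD's sys/mman.h.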

I have now raised the voltage by 0.05V to 1.41250V, as suggested by AMD tech
support, and will try another fresh poudriere run.

At least that panic is something new. Is it caused by a flaky CPU or by a
software bug?
