[Bug 219399] System panics after several hours of 14-threads-compilation orgies using poudriere on AMD Ryzen...

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Thu Jul 13 23:24:43 UTC 2017


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219399

--- Comment #65 from Don Lewis <truckman at FreeBSD.org> ---
(In reply to SF from comment #63)

The motherboard I'm currently using has six Vcore VRM phases.  Basically the
top of the line for Gigabyte AM4 boards.  The only difference between this
board and the Gigabyte flagship is that this board doesn't have an adjustable
bclk.

I basically didn't see any difference between this board and the B350 board
that I was initially using.  Both crashed or locked up when doing parallel
compiles, but both survived running 16 threads of Prime95 (actually mprime on
FreeBSD because I don't have Windows).

This X370 board has problems with SMT off and half the cores disabled, so
basically only four parallel threads running.  That should hardly stress the
PSU or VRM at all and temperatures should be pretty low.  Even with everything
on, the idle temps in the BIOS look good, so I don't think it's a thermal
problem.  My last crash was early this morning, when the room temperature was a
lot lower than when the machine was running happily last evening.  There are no
VRM knobs in the Gigabyte BIOS other than voltage and LLC.  I would think those
wouldn't be critical at 1/4 load ...

It doesn't appear to be a RAM timing problem.  Cranking the RAM speed down
basically has no effect.  ECC should be working, so if a single-bit error
cropped up, it should get corrected.  Memtest86 was clean, even the rowhammer
test.

The crashes seem to be fairly random.  Restarting the ports that were building
at the time of a crash is often successful.

The run that I did after upgrading to AGESA 1006 was by far the best.  With all
eight cores enabled but SMT still off, poudriere ran for a bit more than 10
hours.  As I previously mentioned, three ports failed due to the jemalloc
problem, but the machine stayed up.  I restarted poudriere and those ports
built as well as a number of ports that depended on them.  The build ran for a
few hours, but the machine silently rebooted before poudriere finished.  When
I restarted poudriere, all but one of the remaining ports built.  I didn't see any
obvious error in the log for the failing port, but it successfully built when I
ran poudriere another time.

-- 
You are receiving this mail because:
You are the assignee for the bug.

