[Bug 219399] System panics after several hours of 14-threads-compilation orgies using poudriere on AMD Ryzen...

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Sat Oct 7 19:27:10 UTC 2017


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219399

--- Comment #252 from Don Lewis <truckman at FreeBSD.org> ---
(In reply to SF from comment #250)
This mega-thread https://community.amd.com/thread/215773?start=0&tstart=0 on
AMD Community Forum is full of Linux users who are experiencing random
segfaults when doing parallel compiles.  Lots of experiments with different
voltage settings, RAM timing settings, and tweaking of other BIOS knobs.

AMD eventually admitted that there is a "performance marginality" issue and has
been doing warranty replacements for customers who run into this problem. 
Sometimes they request that the customer perform some experiments with various
voltage and other settings before approving the replacement, but I don't recall
seeing any success stories from that.

AMD was apparently manually screening some of the replacement CPUs before
shipping them, as evidenced by one of the seals on the replacement CPU being
cut and traces of thermal compound on the CPU.  At least in some cases AMD
performed testing with hardware identical to the customer's.

The system crashes and hangs that I and many other FreeBSD users experienced
were caused by the behavior of the instruction prefetch hardware near the
maximum possible
user address 0x7fffffffffff.  This problem affected both FreeBSD and
DragonFly BSD.  I don't know about the other BSDs.  We implemented an acceptable
workaround in r321899.
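
To make "near the maximum possible user address" concrete, here is a small
sketch of the address arithmetic.  It assumes 4 KiB pages, which is not stated
above, and the helper names are purely illustrative.  It only shows which page
sits at the very top of the user address range; it does not reproduce the
hardware behavior and is not a description of the change made in r321899.

```python
# Assumptions (not from the original message): 4 KiB pages.
# MAX_USER_ADDR is the amd64 user address limit quoted above.
PAGE_SIZE = 4096
MAX_USER_ADDR = 0x7FFFFFFFFFFF


def page_base(addr: int, page_size: int = PAGE_SIZE) -> int:
    """Return the start address of the page that contains addr."""
    return addr & ~(page_size - 1)


def bytes_to_limit(addr: int) -> int:
    """Return the number of bytes from addr up to and including the limit."""
    return MAX_USER_ADDR - addr + 1


top_page = page_base(MAX_USER_ADDR)
print(hex(top_page))             # 0x7ffffffff000
print(bytes_to_limit(top_page))  # 4096
```

Under that assumption, code running in the page starting at 0x7ffffffff000 is
never more than one page away from the end of the user address range.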

-- 
You are receiving this mail because:
You are the assignee for the bug.

