[Bug 219399] System panics after several hours of 14-threads-compilation orgies using poudriere on AMD Ryzen...

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Mon Jul 24 08:10:58 UTC 2017


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219399

--- Comment #92 from Konstantin Belousov <kib at FreeBSD.org> ---
(In reply to Don Lewis from comment #90)
Yes, the coredumping message is because the object backing the shared page
entry is only initialized with a single page, so an attempt to read from the
second page cannot be satisfied without the backing physical memory.

From what I see in the AMD support forums/reddit threads, the issue is not
diagnosed yet and AMD is silent about it.  The strangest thing I found was a
claim that sometimes the CPU executes instructions from %rip+0x40 instead of
%rip.  That would explain Dillon's fix but would probably have no effect on the
FreeBSD trampoline layout, unless some more weirdness is in place.

If the problem is indeed hardware (I hope so) and AMD is able to identify and
fix it, I very much dislike the global change to the AMD64 native VA layout.
My concerns are due to the USRSTACK value leaking to tools and becoming part of
the ABI.  For instance, I added kern.proc.<pid>.sigtramp so that debuggers and
unwinders like libunwind can avoid using a pre-defined value for the trampoline
base to detect signal frames, but some tools have not been converted, and old
binaries cannot be fixed.  There is a similar concern for old libc's
setproctitle(3).  Etc.

I suggest trying a different approach for implementing your workaround: if a
matching CPU is detected, decrement sv_usrstack and sv_shared_page_base by
PAGE_SIZE.  I expect that the image activator is sufficiently parametrized by
struct sysentvec to make this work; if not, I will fix it.  For the 64-bit
Linux emulation, a similar adjustment to the Linux ABI sysentvec should be made
at module init.
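
A minimal userspace sketch of the adjustment suggested above, not kernel code.
Everything prefixed with sketch_ is hypothetical: the structure only mirrors
the two struct sysentvec fields named in this comment, and the CPU predicate
(AMD, family 0x17) is an assumption about what "matching CPU" could mean.  The
real change would operate on the kernel's own sysentvec instances at boot or
module init.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for this sketch; the kernel would use PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Stand-in for the two struct sysentvec fields discussed in the comment. */
struct sketch_sysentvec {
	uint64_t sv_usrstack;          /* top of the user stack */
	uint64_t sv_shared_page_base;  /* base of the shared page */
};

/*
 * Hypothetical detection predicate.  Treating AMD family 0x17 as the
 * affected set is an assumption made for illustration only.
 */
static bool
sketch_cpu_is_affected(bool vendor_is_amd, unsigned int family)
{
	return (vendor_is_amd && family == 0x17);
}

/*
 * Move both the user stack top and the shared page base down by one
 * page, but only when an affected CPU was detected.
 */
static void
sketch_adjust_sysentvec(struct sketch_sysentvec *sv, bool affected)
{
	assert(sv != NULL);
	if (!affected)
		return;
	sv->sv_usrstack -= SKETCH_PAGE_SIZE;
	sv->sv_shared_page_base -= SKETCH_PAGE_SIZE;
}
```

Because only the per-ABI sysentvec values change, and only on affected CPUs,
the default layout stays as it is on every other machine, which is the point
of preferring this over a global change.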

It is a shame that AMD is silent and does not provide errata/notifications of
problems for their flagship CPUs.

-- 
You are receiving this mail because:
You are the assignee for the bug.

