[Bug 219399] System panics after several hours of 14-threads-compilation orgies using poudriere on AMD Ryzen...

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Thu Aug 3 04:46:22 UTC 2017


--- Comment #202 from Don Lewis <truckman at FreeBSD.org> ---
(In reply to Nils Beyer from comment #200)
I believe so.  It's pretty unlikely that the problem is caused by undefined
opcodes, and we are not seeing any evidence (SIGILL) of valid instructions
being trapped as invalid because they experience page faults mid-fetch.

BTW, using either my original workaround patch, or the committed version if the
sv_maxuser adjustment is commented out, it is possible to use a user process to
mmap() the top page of user memory, load some code up there, and execute it for
testing purposes.  I've done some experiments with that and it is possible to
quickly hang the machine or cause it to reboot.  The interesting thing is that
I haven't observed any ill effects as long as no instructions are executed
above 0x7fffffffff40.  That's sort of in the area mentioned in the DragonFly
fix, but even they saw issues at addresses lower than that and a decreasing
rate as the address was lowered.  Our signal trampoline code was much closer to
the bottom of the page at 0x7ffffffff000, so at this point I don't know why we
were having problems.  The only thing that I can think of is that the signal
trampoline code uses some unusual instructions like syscall and hlt, which are
unlike the more vanilla instructions that I was using in my experiments.
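
To put the two addresses above in relation to each other, here is a small
sketch of the arithmetic.  It assumes 4 KiB base pages (which is what the
page bottom at 0x7ffffffff000 implies); the names PAGE_SIZE and page_layout
are made up for illustration and do not come from any patch.

```python
# Assumption: 4 KiB base pages, consistent with the page bottom at
# 0x7ffffffff000 quoted above.
PAGE_SIZE = 0x1000


def page_layout(addr, page_size=PAGE_SIZE):
    """Return (page base, offset into page, bytes from addr to end of page)."""
    base = addr & ~(page_size - 1)
    offset = addr - base
    remaining = page_size - offset
    return base, offset, remaining


# Highest address below which no ill effects were observed in the experiments.
SAFE_LIMIT = 0x7fffffffff40

base, offset, remaining = page_layout(SAFE_LIMIT)
print(hex(base), hex(offset), remaining)
```

So 0x7fffffffff40 sits 0xf40 bytes into the top page, leaving only the last
192 bytes of the page as the range where the experiments showed trouble.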

You are receiving this mail because:
You are the assignee for the bug.
