[Bug 219399] System panics after several hours of 14-threads-compilation orgies using poudriere on AMD Ryzen...

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Tue Jul 25 13:30:59 UTC 2017


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219399

--- Comment #121 from Nils Beyer <nbe at renzel.net> ---
(In reply to Don Lewis from comment #119)

> This gives mfence() some memory loads to wait for, which allows the data to be migrated from the core A cache.  With this change, I no longer get any segfaults.

Confirmed: with that change, I haven't gotten any segfaults in 500 passes.
There is, however, a discrepancy in how many passes each core has completed:
---------------------------------------------------------------------------
[...]
412: Tue Jul 25 15:19:00 CEST 2017: OK
405: Tue Jul 25 15:19:01 CEST 2017: OK
402: Tue Jul 25 15:19:01 CEST 2017: OK
420: Tue Jul 25 15:19:01 CEST 2017: OK
410: Tue Jul 25 15:19:01 CEST 2017: OK
406: Tue Jul 25 15:19:01 CEST 2017: OK
410: Tue Jul 25 15:19:01 CEST 2017: OK
414: Tue Jul 25 15:19:01 CEST 2017: OK
410: Tue Jul 25 15:19:01 CEST 2017: OK
409: Tue Jul 25 15:19:02 CEST 2017: OK
413: Tue Jul 25 15:19:02 CEST 2017: OK
423: Tue Jul 25 15:19:02 CEST 2017: OK
397: Tue Jul 25 15:19:02 CEST 2017: OK
411: Tue Jul 25 15:19:02 CEST 2017: OK
401: Tue Jul 25 15:19:02 CEST 2017: OK
421: Tue Jul 25 15:19:02 CEST 2017: OK
438: Tue Jul 25 15:19:02 CEST 2017: OK
427: Tue Jul 25 15:19:02 CEST 2017: OK
406: Tue Jul 25 15:19:02 CEST 2017: OK
---------------------------------------------------------------------------

In my eyes, each core is performing the same workload and should therefore be
at the same pass number. Maybe I'm completely wrong. Is that something you've
observed, too?
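
As a rough check of how large the discrepancy is, the pass numbers in the
excerpt above can be summarized with a few lines of Python. This is only a
sketch: it assumes each log line is one independent sample, since the excerpt
does not show which worker or core wrote which line, and the names "passes"
and "summarize" are made up for illustration.

```python
# Pass numbers copied from the 19 log lines in the excerpt above.
# Assumption: each line is treated as one independent sample; the excerpt
# does not say which worker or core produced which line.
passes = [412, 405, 402, 420, 410, 406, 410, 414, 410, 409,
          413, 423, 397, 411, 401, 421, 438, 427, 406]


def summarize(values):
    """Return the min, max, spread (max - min), and mean of the pass numbers."""
    lo, hi = min(values), max(values)
    return {
        "min": lo,
        "max": hi,
        "spread": hi - lo,
        "mean": sum(values) / len(values),
    }


summary = summarize(passes)
print(summary)
```

For these 19 lines the pass numbers range from 397 to 438, a spread of 41
passes, which is roughly 10% of the mean.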


> Ryzen bug?  Just more aggressive prefetching?  I don't know ...

It's a rather difficult question: if CPU A executes something without
segfaults, and CPU B throws segfaults using the same executable, does that
automatically mean that CPU B is doing something wrong? Or does it rather mean
that CPU B is not 100% compatible with CPU A and therefore needs an appropriate
executable?

I ask because I wonder whether that's something that should be reported to AMD
tech support, particularly because I have an open ticket there...


