[Bug 221029] AMD Ryzen: strange compilation failures using poudriere or plain buildkernel/buildworld

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Wed Jul 26 20:57:13 UTC 2017


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221029

--- Comment #13 from Nils Beyer <nbe at renzel.net> ---
(In reply to Don Lewis from comment #12)

> I'm pretty sure I saw one while doing a buildworld on ZFS as well.  I think these errors occur less frequently on ZFS.  I just did two poudriere runs with tmpfs disabled and didn't see this error.

Try my buildkernel/buildworld script "ryzen_stress_test.sh" and let it run
for 24 hours. Execute it with:

    /usr/bin/nohup sh ryzen_stress_test.sh &

and look for a "nohup.out" file like this (a non-zero errorcode marks a failed run):
-----------------------------------------------------------------------------
mkdir: /tmp/ryzen_stress_test: File exists
Wed Jul 26 19:23:09 CEST 2017 begin
Wed Jul 26 19:45:04 CEST 2017 end - errorcode 0
Wed Jul 26 19:45:04 CEST 2017 begin
Wed Jul 26 20:07:06 CEST 2017 end - errorcode 0
Wed Jul 26 20:07:06 CEST 2017 begin
Wed Jul 26 20:29:09 CEST 2017 end - errorcode 0
Wed Jul 26 20:29:09 CEST 2017 begin
Wed Jul 26 20:44:52 CEST 2017 end - errorcode 2
Wed Jul 26 20:44:52 CEST 2017 begin
Wed Jul 26 21:06:52 CEST 2017 end - errorcode 0
Wed Jul 26 21:06:52 CEST 2017 begin
Wed Jul 26 21:28:55 CEST 2017 end - errorcode 0
Wed Jul 26 21:28:55 CEST 2017 begin
Wed Jul 26 21:50:57 CEST 2017 end - errorcode 0
Wed Jul 26 21:50:57 CEST 2017 begin
Wed Jul 26 22:13:00 CEST 2017 end - errorcode 0
Wed Jul 26 22:13:00 CEST 2017 begin
Wed Jul 26 22:35:00 CEST 2017 end - errorcode 0
Wed Jul 26 22:35:00 CEST 2017 begin
-----------------------------------------------------------------------------
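
The script itself is not included in this message. As a rough sketch only,
a loop that produces log lines in the same "begin" / "end - errorcode N"
shape could look like the following. The function name stress_loop, the run
count, and the way build output is discarded are all assumptions made for
illustration; the real ryzen_stress_test.sh may differ (for example, the log
above suggests it also creates /tmp/ryzen_stress_test).

```shell
#!/bin/sh
# Hypothetical sketch, NOT the original ryzen_stress_test.sh.
# stress_loop RUNS COMMAND [ARGS...]
# Runs COMMAND repeatedly and logs a timestamped begin/end line per run,
# including the exit status of the command.
stress_loop() {
    runs=$1
    shift
    i=0
    while [ "$i" -lt "$runs" ]; do
        echo "$(date) begin"
        # Build output is discarded here; only the exit status is logged.
        "$@" > /dev/null 2>&1
        rc=$?
        echo "$(date) end - errorcode $rc"
        i=$((i + 1))
    done
}

# Example invocation (assumed source path and job count, adjust as needed):
#   stress_loop 50 make -C /usr/src -j20 buildworld buildkernel
```

Saving the exit status in a variable right after the command keeps the logged
value from being overwritten by a later command.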


> My first suspicion is that this could be a race condition in our code exposed by more parallelism.

I don't think so, because this also happens in poudriere builds, which are
mainly single-threaded; "kf5-kservice-5.36.0", for instance, produced this
error even though its build is single-threaded. And with
buildkernel/buildworld, it does not happen on my Intel system with the same
number of threads (20).


> Which version of the share page patch are you running?

This one:
-------------------------------------------------------------------------------
Index: sys/amd64/include/vmparam.h
===================================================================
--- sys/amd64/include/vmparam.h (revision 321399)
+++ sys/amd64/include/vmparam.h (working copy)
@@ -176,7 +176,7 @@

 #define        VM_MAXUSER_ADDRESS      UVADDR(NUPML4E, 0, 0, 0)

-#define        SHAREDPAGE              (VM_MAXUSER_ADDRESS - PAGE_SIZE)
+#define        SHAREDPAGE              (VM_MAXUSER_ADDRESS - 2*PAGE_SIZE)
 #define        USRSTACK                SHAREDPAGE

 #define        VM_MAX_ADDRESS          UPT_MAX_ADDRESS
-------------------------------------------------------------------------------
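
To make the effect of this one-line change concrete, the resulting addresses
can be worked out with plain shell arithmetic. This is a sketch under stated
assumptions: 4 KiB pages and a user address space ending at 2^47, which is
what UVADDR(NUPML4E, 0, 0, 0) is assumed to evaluate to on amd64. Neither
value is stated in this message, and sharedpage_addr is a hypothetical helper
name.

```shell
#!/bin/sh
# Assumed amd64 constants (not taken from this message).
PAGE_SIZE=4096
VM_MAXUSER_ADDRESS=$((1 << 47))

# sharedpage_addr N
# Prints the address N pages below VM_MAXUSER_ADDRESS in hex.
sharedpage_addr() {
    printf '0x%x\n' "$((VM_MAXUSER_ADDRESS - $1 * PAGE_SIZE))"
}

sharedpage_addr 1   # unpatched: VM_MAXUSER_ADDRESS - PAGE_SIZE
sharedpage_addr 2   # patched:   VM_MAXUSER_ADDRESS - 2*PAGE_SIZE
```

Under these assumptions the patch moves SHAREDPAGE, and with it USRSTACK,
down by one page, so the topmost user page is no longer used for the shared
page.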


> Earlier you mentioned not seeing this on the machine using the original version.

I think you mean this comment here:

    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219399#c127

I hadn't seen them at that time, but they appeared in a subsequent
poudriere session.


