[Bug 257786] AMD GPU freeze with the new g20210330 gpu firmwares

From: <bugzilla-noreply_at_freebsd.org>
Date: Thu, 12 Aug 2021 14:09:00 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=257786

            Bug ID: 257786
           Summary: AMD GPU freeze with the new g20210330 gpu firmwares
           Product: Ports & Packages
           Version: Latest
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: Individual Port(s)
          Assignee: ports-bugs@FreeBSD.org
          Reporter: ali.abdallah@suse.com

I recently switched my ports tree from 2021Q2 to 2021Q3. After updating
and rebooting my FreeBSD 13.0 system, I started to notice random system
freezes. I can still ssh into the frozen system, and dmesg shows:

---
Aug  4 08:58:51 Fryzen495 kernel: drmn0: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process  pid 100349 thread  pid 100349)
Aug  4 08:58:51 Fryzen495 kernel: drmn0:   in page starting at address 0x000080012c3f0000 from client 27
Aug  4 08:58:51 Fryzen495 kernel: drmn0: VM_L2_PROTECTION_FAULT_STATUS:0x00141051
Aug  4 08:58:51 Fryzen495 kernel: drmn0:      MORE_FAULTS: 0x1
Aug  4 08:58:51 Fryzen495 kernel: drmn0:      WALKER_ERROR: 0x0
Aug  4 08:58:51 Fryzen495 kernel: drmn0:      PERMISSION_FAULTS: 0x5
Aug  4 08:58:51 Fryzen495 kernel: drmn0:      MAPPING_ERROR: 0x0
Aug  4 08:58:51 Fryzen495 kernel: drmn0:      RW: 0x1
---
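
For anyone reading the fault status in the log, here is a minimal Python
sketch of how the raw register value 0x00141051 could map to the decoded
fields printed after it. The bit positions are an assumption, chosen because
they reproduce the values the driver printed; they are not taken from the
driver source. FIELDS and decode_fault_status are hypothetical names used
only for illustration.

```python
# Assumed (shift, width) of each field in VM_L2_PROTECTION_FAULT_STATUS.
# These positions are inferred from the logged values, not from driver code.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "RW": (18, 1),
}


def decode_fault_status(status):
    """Extract each assumed bit field from the raw 32-bit status value."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


# Raw value copied from the dmesg output in this report.
decoded = decode_fault_status(0x00141051)
for name, value in decoded.items():
    print(f"{name}: {value:#x}")
```

Under this assumed layout, the decoded fields match the five values logged
by the driver (MORE_FAULTS through RW).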

The only change between 2021Q2 and 2021Q3 that seems relevant to me is the
newer GPU firmware, g20210330 versus g20210224. After I downgraded to
g20210224 and rebooted, the system has been running as stably as before.

My system is a ThinkPad T495 with a Picasso GPU. Please don't hesitate to ask
for more information.
