Picasso AMD GPU freeze with the new g20210330 gpu firmwares

From: Ali Abdallah via freebsd-stable <freebsd-stable_at_freebsd.org>
Date: Fri, 06 Aug 2021 15:52:52 UTC
Hi,

I recently switched my ports tree from 2021Q2 to 2021Q3. After updating
and rebooting my FreeBSD 13.0 system, I started to notice random system
freezes. I can still ssh into the frozen system, and in dmesg I see:

---
Aug  4 08:58:51 Fryzen495 kernel: drmn0: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process  pid 100349 thread  pid 100349)
Aug  4 08:58:51 Fryzen495 kernel: drmn0:   in page starting at address 0x000080012c3f0000 from client 27
Aug  4 08:58:51 Fryzen495 kernel: drmn0: VM_L2_PROTECTION_FAULT_STATUS:0x00141051
Aug  4 08:58:51 Fryzen495 kernel: drmn0: 	 MORE_FAULTS: 0x1
Aug  4 08:58:51 Fryzen495 kernel: drmn0: 	 WALKER_ERROR: 0x0
Aug  4 08:58:51 Fryzen495 kernel: drmn0: 	 PERMISSION_FAULTS: 0x5
Aug  4 08:58:51 Fryzen495 kernel: drmn0: 	 MAPPING_ERROR: 0x0
Aug  4 08:58:51 Fryzen495 kernel: drmn0: 	 RW: 0x1
---
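
For reference, the decoded fields in the log can be recovered from the raw
VM_L2_PROTECTION_FAULT_STATUS value. The following is only a minimal sketch:
the bit positions are an assumption, inferred from the values logged above
and believed to match the amdgpu GMC v9 register layout, and the names
FIELDS and decode_fault_status are hypothetical helpers, not part of any
driver.

```python
# Assumed bit layout (shift, width) for VM_L2_PROTECTION_FAULT_STATUS.
# Inferred from the logged fields; not taken from official documentation.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "RW": (18, 1),
}


def decode_fault_status(status):
    """Split a raw status word into the named bit fields."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


# Raw value taken from the dmesg excerpt above.
decoded = decode_fault_status(0x00141051)
for name, value in decoded.items():
    print(f"{name}: {value:#x}")
```

Under that assumed layout the script reproduces the logged values:
PERMISSION_FAULTS is 0x5 and RW is 0x1, with no walker or mapping error.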

The only change that seemed relevant to me between 2021Q2 and 2021Q3 is
the newer GPU firmware, g20210330 versus g20210224. I downgraded to
g20210224, rebooted the system, and it is running as stably as before.

I found a similar issue here:

https://githubmemory.com/repo/freebsd/drm-kmod/issues/78

But I didn't try the work-around mentioned there. My system is always
connected to two external monitors through a USB-C dock, and it had been
running for almost 4 months without a reboot with the old firmware.

Shall I open a bug report? I hope this mail helps someone.

Regards,
Ali.