maintainer-feedback requested: [Bug 289921] graphics/drm-61-kmod: amdgpu freezes on RX 7800 XT with FreeBSD 14.3

From: <bugzilla-noreply_at_freebsd.org>
Date: Tue, 30 Sep 2025 16:30:58 UTC
Bugzilla Automation <bugzilla@FreeBSD.org> has asked freebsd-x11 (Nobody)
<x11@FreeBSD.org> for maintainer-feedback:
Bug 289921: graphics/drm-61-kmod: amdgpu freezes on RX 7800 XT with FreeBSD
14.3
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=289921



--- Description ---
On FreeBSD 14.3-RELEASE-p3 with graphics/drm-61-kmod, the system experiences
frequent GPU resets and amdgpu job timeouts on a Radeon RX 7800 XT. The problem
occurs during regular desktop usage (sway, Firefox, and a bunch of terminals).

System:
```
% uname -a
FreeBSD chrostik 14.3-RELEASE-p3 FreeBSD 14.3-RELEASE-p3 GENERIC amd64

% pkg info drm-61-kmod
drm-61-kmod-6.1.128.1403000_6
Name	       : drm-61-kmod
Version        : 6.1.128.1403000_6
Installed on   : Mon Sep 29 18:03:22 2025 CEST
Origin	       : graphics/drm-61-kmod
Architecture   : FreeBSD:14:amd64
Categories     : graphics kld
Maintainer     : x11@FreeBSD.org
WWW	       : https://github.com/freebsd/drm-kmod/
Comment        : DRM drivers modules
Annotations    :
	FreeBSD_version: 1403000
Flat size      : 16.9MiB
Description    :
amdgpu, i915, and radeon DRM drivers modules.
Currently corresponding to Linux 6.1 DRM.
This version is for FreeBSD 14-STABLE 1400508
and above.

% pciconf -lv | grep -B3 -A1 display
vgapci0@pci0:3:0:0:	class=0x030000 rev=0xc8 hdr=0x00 vendor=0x1002
device=0x747e subvendor=0x1da2 subdevice=0xd475
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'Navi 32 [Radeon RX 7700 XT / 7800 XT]'
    class      = display
    subclass   = VGA
```

dmesg:

```
[drm ERROR :amdgpu_job_timedout] ring sdma1 timeout, signaled seq=50646, emitted seq=50648
[drm ERROR :amdgpu_job_timedout] Process information: process  pid 0 thread  pid 0
drmn0: GPU reset begin!
drmn0: free PSP TMR buffer
drmn0: MODE1 reset
drmn0: GPU mode1 reset
drmn0: GPU smu mode1 reset
drmn0: GPU reset succeeded, trying to resume
[drm] PCIE GART of 512M enabled (table at 0x0000008000300000).
[drm] VRAM is lost due to GPU reset!
[drm] PSP is resuming...
[drm] reserve 0xa700000 from 0x83e0000000 for PSP TMR
drmn0: RAP: optional rap ta ucode is not available
drmn0: SECUREDISPLAY: securedisplay ta ucode is not available
drmn0: SMU is resuming...
drmn0: smu driver if version = 0x00000032, smu fw if version = 0x0000003f, smu fw program = 0, smu fw version = 0x00503400 (80.52.0)
drmn0: SMU driver if version not matched
drmn0: SMU is resumed successfully!
[drm] DMUB hardware initialized: version=0x07001900
[drm] REG_WAIT timeout 1us * 1000 tries - dcn32_dsc_pg_control line:96
[drm] REG_WAIT timeout 1us * 1000 tries - dcn32_dsc_pg_control line:104
[drm] REG_WAIT timeout 1us * 1000 tries - dcn32_dsc_pg_control line:112
[drm] REG_WAIT timeout 1us * 1000 tries - dcn32_dsc_pg_control line:120
[drm] REG_WAIT timeout 1us * 1000 tries - dcn32_dsc_pg_control line:96
[drm] REG_WAIT timeout 1us * 1000 tries - dcn32_dsc_pg_control line:104
[drm] REG_WAIT timeout 1us * 1000 tries - dcn32_dsc_pg_control line:112
[drm] REG_WAIT timeout 1us * 1000 tries - dcn32_dsc_pg_control line:120
[drm] kiq ring mec 3 pipe 1 q 0
[drm] VCN decode and encode initialized successfully(under DPG Mode).
drmn0: [drm] jpeg_v4_0_hw_initdrmn0: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
drmn0: ring comp_1.0.0 uses VM inv eng 1 on hub 0
drmn0: ring comp_1.1.0 uses VM inv eng 4 on hub 0
drmn0: ring comp_1.2.0 uses VM inv eng 6 on hub 0
drmn0: ring comp_1.3.0 uses VM inv eng 7 on hub 0
drmn0: ring comp_1.0.1 uses VM inv eng 8 on hub 0
drmn0: ring comp_1.1.1 uses VM inv eng 9 on hub 0
drmn0: ring comp_1.2.1 uses VM inv eng 10 on hub 0
drmn0: ring comp_1.3.1 uses VM inv eng 11 on hub 0
drmn0: ring sdma0 uses VM inv eng 12 on hub 0
drmn0: ring sdma1 uses VM inv eng 13 on hub 0
drmn0: ring vcn_unified_0 uses VM inv eng 0 on hub 1
drmn0: ring vcn_unified_1 uses VM inv eng 1 on hub 1
drmn0: ring jpeg_dec uses VM inv eng 4 on hub 1
drmn0: ring mes_kiq_3.1.0 uses VM inv eng 14 on hub 0
drmn0: recover vram bo from shadow start
drmn0: recover vram bo from shadow done
drmn0: GPU reset(1) succeeded!
[drm ERROR :amdgpu_job_timedout] ring sdma1 timeout, signaled seq=50649, emitted seq=50649
[drm ERROR :amdgpu_job_timedout] Process information: process  pid 0 thread  pid 0
drmn0: GPU reset begin!
```
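
To gauge how often the timeouts hit and which ring is involved, the log can be
tallied per ring. The following is only a minimal sketch, assuming the message
format shown in the dmesg excerpt above; the function name
`count_ring_timeouts` is made up for illustration and is not part of any tool.

```sh
# Hypothetical helper: read dmesg text on stdin and print "<count> <ring>"
# for every ring named in an amdgpu_job_timedout message.
# Assumes the "ring <name> timeout" wording seen in the log above.
count_ring_timeouts() {
    grep -E 'amdgpu_job_timedout\] ring [^ ]+ timeout' |
        sed -E 's/.*amdgpu_job_timedout\] ring ([^ ]+) timeout.*/\1/' |
        sort | uniq -c | awk '{print $1, $2}'
}
```

It could be run as `dmesg | count_ring_timeouts`. Only the ring name is
matched, so the result does not depend on whether the long lines are wrapped.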

No overclocking, everything is up to date, and the UEFI is recent. Hardware:
ASUS ROG STRIX X870 E-GAMING + Ryzen 7900.