[Bug 267028] General protection fault kernel panic immediately after kldload amdgpu

From: <bugzilla-noreply_at_freebsd.org>
Date: Sun, 06 Nov 2022 18:15:41 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=267028

--- Comment #13 from George Mitchell <george@m5p.com> ---
As of today, with version drm-510-kmod-5.10.113_8:

1. I can reliably prevent a crash by booting to single-user mode, manually
kldloading amdgpu, and then continuing the boot (typing control-d).  dmesg
then reports:

[drm] amdgpu kernel modesetting enabled.
drmn0: <drmn> on vgapci0
vgapci0: child drmn0 requested pci_enable_io
vgapci0: child drmn0 requested pci_enable_io
[drm] initializing kernel modesetting (RAVEN 0x1002:0x15DD 0x1458:0xD000 0xC8).
drmn0: Trusted Memory Zone (TMZ) feature disabled as experimental (default)
[drm] register mmio base: 0xFE600000
[drm] register mmio size: 524288
[drm] add ip block number 0 <soc15_common>
[drm] add ip block number 1 <gmc_v9_0>
[drm] add ip block number 2 <vega10_ih>
[drm] add ip block number 3 <psp>
[drm] add ip block number 4 <gfx_v9_0>
[drm] add ip block number 5 <sdma_v4_0>
[drm] add ip block number 6 <powerplay>
[drm] add ip block number 7 <dm>
[drm] add ip block number 8 <vcn_v1_0>
drmn0: successfully loaded firmware image 'amdgpu/raven_gpu_info.bin'
[drm] BIOS signature incorrect 44 f
drmn0: Fetched VBIOS from ROM BAR
amdgpu: ATOM BIOS: 113-RAVEN-111
drmn0: successfully loaded firmware image 'amdgpu/raven_sdma.bin'
[drm] VCN decode is enabled in VM mode
[drm] VCN encode is enabled in VM mode
[drm] JPEG decode is enabled in VM mode
[drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
drmn0: VRAM: 2048M 0x000000F400000000 - 0x000000F47FFFFFFF (2048M used)
drmn0: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF
drmn0: AGP: 267419648M 0x000000F800000000 - 0x0000FFFFFFFFFFFF
[drm] Detected VRAM RAM=2048M, BAR=2048M
[drm] RAM width 128bits DDR4
[TTM] Zone  kernel: Available graphics memory: 3100774 KiB
[TTM] Zone   dma32: Available graphics memory: 2097152 KiB
[TTM] Initializing pool allocator
[drm] amdgpu: 2048M of VRAM memory ready
[drm] amdgpu: 3072M of GTT memory ready.
[drm] GART: num cpu pages 262144, num gpu pages 262144
[drm] PCIE GART of 1024M enabled (table at 0x000000F400900000).
drmn0: successfully loaded firmware image 'amdgpu/raven_asd.bin'
drmn0: successfully loaded firmware image 'amdgpu/raven_ta.bin'
drmn0: successfully loaded firmware image 'amdgpu/raven_pfp.bin'
drmn0: successfully loaded firmware image 'amdgpu/raven_me.bin'
drmn0: successfully loaded firmware image 'amdgpu/raven_ce.bin'
drmn0: successfully loaded firmware image 'amdgpu/raven_rlc.bin'
drmn0: successfully loaded firmware image 'amdgpu/raven_mec.bin'
drmn0: successfully loaded firmware image 'amdgpu/raven_mec2.bin'
amdgpu: hwmgr_sw_init smu backed is smu10_smu
drmn0: successfully loaded firmware image 'amdgpu/raven_vcn.bin'
[drm] Found VCN firmware Version ENC: 1.12 DEC: 2 VEP: 0 Revision: 1
drmn0: Will use PSP to load VCN firmware
[drm] reserve 0x400000 from 0xf47fc00000 for PSP TMR
drmn0: RAS: optional ras ta ucode is not available
drmn0: RAP: optional rap ta ucode is not available
[drm] kiq ring mec 2 pipe 1 q 0
[drm] DM_PPLIB: values for F clock
[drm] DM_PPLIB:  400000 in kHz, 3649 in mV
[drm] DM_PPLIB:  933000 in kHz, 4074 in mV
[drm] DM_PPLIB:  1200000 in kHz, 4399 in mV
[drm] DM_PPLIB:  1333000 in kHz, 4399 in mV
[drm] DM_PPLIB: values for DCF clock
[drm] DM_PPLIB:  300000 in kHz, 3649 in mV
[drm] DM_PPLIB:  600000 in kHz, 4074 in mV
[drm] DM_PPLIB:  626000 in kHz, 4250 in mV
[drm] DM_PPLIB:  654000 in kHz, 4399 in mV
[drm] Display Core initialized with v3.2.104!
[drm] VCN decode and encode initialized successfully(under SPG Mode).
drmn0: SE 1, SH per SE 1, CU per SH 11, active_cu_number 8
[drm] fb mappable at 0x60BCA000
[drm] vram apper at 0x60000000
[drm] size 8294400
[drm] fb depth is 24
[drm]    pitch is 7680
VT: Replacing driver "vga" with new "fb".
start FB_INFO:
type=11 height=1080 width=1920 depth=32
pbase=0x60bca000 vbase=0xfffff80060bca000
name=drmn0 flags=0x0 stride=7680 bpp=32
end FB_INFO
drmn0: ring gfx uses VM inv eng 0 on hub 0
drmn0: ring comp_1.0.0 uses VM inv eng 1 on hub 0
drmn0: ring comp_1.1.0 uses VM inv eng 4 on hub 0
drmn0: ring comp_1.2.0 uses VM inv eng 5 on hub 0
drmn0: ring comp_1.3.0 uses VM inv eng 6 on hub 0
drmn0: ring comp_1.0.1 uses VM inv eng 7 on hub 0
drmn0: ring comp_1.1.1 uses VM inv eng 8 on hub 0
drmn0: ring comp_1.2.1 uses VM inv eng 9 on hub 0
drmn0: ring comp_1.3.1 uses VM inv eng 10 on hub 0
drmn0: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
drmn0: ring sdma0 uses VM inv eng 0 on hub 1
drmn0: ring vcn_dec uses VM inv eng 1 on hub 1
drmn0: ring vcn_enc0 uses VM inv eng 4 on hub 1
drmn0: ring vcn_enc1 uses VM inv eng 5 on hub 1
drmn0: ring jpeg_dec uses VM inv eng 6 on hub 1
vgapci0: child drmn0 requested pci_get_powerstate
sysctl_warn_reuse: can't re-use a leaf (hw.dri.debug)!
[drm] Initialized amdgpu 3.40.0 20150101 for drmn0 on minor 0

Is the sysctl_warn_reuse message anything to worry about?

2. Adding amdgpu to kld_list in rc.conf still crashes more often than not,
as previously reported.

3. Attempting to load amdgpu via /boot/loader.conf appears to load the module
into memory without actually making it functional.  (X uses VESA mode as if
the module isn't there.)
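
For anyone trying to tell the three cases above apart, here is a minimal
sketch of a check that separates "module not loaded", "module in memory but
never initialized" (what case 3 looks like), and "driver initialized" (the
dmesg output in case 1).  classify_amdgpu is a hypothetical helper name, not
an existing tool.  It assumes kldstat lists the module as amdgpu.ko, and it
keys on the "[drm] Initialized amdgpu" line shown above.  The configuration
lines in the comments are the usual forms for rc.conf and loader.conf, not
copied from my files.

```shell
#!/bin/sh
# Sketch only: classify_amdgpu is a hypothetical helper, not part of FreeBSD.
#
# Usual configuration forms for the load methods discussed above (assumed):
#   case 2, /etc/rc.conf:       kld_list="amdgpu"
#   case 3, /boot/loader.conf:  amdgpu_load="YES"
#
# $1 = file holding saved `kldstat` output
# $2 = file holding saved `dmesg` output
classify_amdgpu() {
    if ! grep -qF 'amdgpu.ko' "$1"; then
        # The module does not appear in the kldstat listing at all.
        echo "not loaded"
    elif grep -qF '[drm] Initialized amdgpu ' "$2"; then
        # The driver printed its final initialization message.
        echo "initialized"
    else
        # The module is in memory but never reported initialization.
        echo "loaded but not initialized"
    fi
}

# Example use on a live system (commented out so the sketch has no side effects):
#   kldstat > /tmp/kldstat.out
#   dmesg   > /tmp/dmesg.out
#   classify_amdgpu /tmp/kldstat.out /tmp/dmesg.out
```

Working from saved output rather than calling kldstat and dmesg directly
keeps the check usable on logs attached to a bug report.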
