Re: drm panic after new world
- In reply to: Steve Kargl : "Re: drm panic after new world"
Date: Tue, 03 Jun 2025 10:08:00 UTC
On Thu, 29 May 2025, Steve Kargl wrote:
> On Thu, May 29, 2025 at 01:06:22PM -0700, Steve Kargl wrote:
>> __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
>> 57 __asm("movq %%gs:%c1,%0" : "=r" (td)
>> (kgdb) #0 __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
>> td = <optimized out>
>
> (snip)
>
>> #5 0xffffffff805c8718 in pfs_add_node (
>> parent=parent@entry=0xfffff80003955400, pn=pn@entry=0xfffff803557e0900)
>> at /usr/src/sys/fs/pseudofs/pseudofs.c:123
>> iter = <optimized out>
>
> This is hitting a KASSERT under the INVARIANTS option.
Yes, and once again the assertion gives us pretty useless information.
I am adding the name and type to it so the panic string alone tells us more.
Thankfully the name is not optimized out in frame #7: 'radeon_ring_gfx'.
>> #6 0xffffffff805c8bd2 in pfs_create_file (parent=0xfffff80003955400,
>> name=name@entry=0xffffffff82b293f4 "radeon_ring_gfx",
>> fill=0xffffffff82bf70f0 <debugfs_fill>,
>> attr=0xffffffff82bf72f0 <debugfs_attr>, vis=vis@entry=0x0,
>> destroy=0xffffffff82bf7310 <debugfs_destroy>, flags=33)
>> at /usr/src/sys/fs/pseudofs/pseudofs.c:266
>> pn = 0xfffff803557e0900
>> #7 0xffffffff82bf70b8 in debugfs_create_file (
>> name=0xffffffff82b293f4 "radeon_ring_gfx", mode=292,
>> parent=0xfffff8000398e400, data=0xfffffe012354dd30,
>> fops=0xffffffff82b55918 <radeon_debugfs_ring_info_fops>)
>> at /usr/src/sys/compat/lindebugfs/lindebugfs.c:209
There were changes to that code in the timeframe you mention, adding a
new function or using __func__.
But it could also be that CONFIG_DEBUG_FS got turned on somewhere where it
was not before, or is it because you are running a debug kernel instead of
a no-debug one?
>> dm = 0xfffff80003990580
>> dnode = 0xfffff80003990580
>> pnode = <unavailable>
>> flags = <optimized out>
>> _size = <optimized out>
>> _malloc_item = <optimized out>
>> #8 0xffffffff82ad0084 in radeon_ring_init () from /boot/modules/radeonkms.ko
>> No symbol table info available.
>
> How does one get kernel debugging symbols into radeonkms.ko?
I think if you do the buildkernel/installkernel with
LOCAL_MODULES_DIR=/path/to/drm/sources, the symbols are likely to end up in
the right place. I don't know how this works with ports; that is not my
area of expertise.
Looking at 6.6 sources:
My suspicion: given that this path is reset/resume and it calls
radeon_ring_init() for RADEON_RING_TYPE_GFX_INDEX, the original init
path likely did the same and no one cleaned things up in between.
#8 0xffffffff82ad0084 in radeon_ring_init () from /boot/modules/radeonkms.ko
#9 0xffffffff82a5caf7 in evergreen_startup () from /boot/modules/radeonkms.ko
#10 0xffffffff82a5b333 in evergreen_resume () from /boot/modules/radeonkms.ko
#11 0xffffffff82ab3e90 in radeon_gpu_reset () from /boot/modules/radeonkms.ko
The evergreen_startup() function doing the call ..
5083 ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
5084 r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET,
5085 RADEON_CP_PACKET2);
.. is called from evergreen_resume() and evergreen_init().
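To illustrate the suspicion: a minimal userland sketch, assuming the KASSERT in pfs_add_node() is a duplicate-name check. None of this is the real pseudofs or radeon code; toy_dir, toy_add_node(), toy_remove_node() and toy_ring_init() are hypothetical stand-ins, and the toy returns -1 where the real code would panic.

```c
#include <stdio.h>
#include <string.h>

#define TOY_MAX_NODES 8

/* Toy stand-in for a pseudofs directory: just a list of child names. */
struct toy_dir {
	const char *names[TOY_MAX_NODES];
	int count;
};

/*
 * Add a child by name. Returns 0 on success, -1 if the name already
 * exists or the directory is full. (Assumption: the real code asserts
 * on a duplicate under INVARIANTS instead of returning an error.)
 */
static int
toy_add_node(struct toy_dir *dir, const char *name)
{
	for (int i = 0; i < dir->count; i++) {
		if (strcmp(dir->names[i], name) == 0) {
			fprintf(stderr, "duplicate node: %s\n", name);
			return (-1);
		}
	}
	if (dir->count == TOY_MAX_NODES)
		return (-1);
	dir->names[dir->count++] = name;
	return (0);
}

/* Remove a child by name. Returns 0 on success, -1 if not found. */
static int
toy_remove_node(struct toy_dir *dir, const char *name)
{
	for (int i = 0; i < dir->count; i++) {
		if (strcmp(dir->names[i], name) == 0) {
			/* Fill the hole with the last entry. */
			dir->names[i] = dir->names[--dir->count];
			return (0);
		}
	}
	return (-1);
}

/* Hypothetical ring init that registers its debug file on every call. */
static int
toy_ring_init(struct toy_dir *dir)
{
	return (toy_add_node(dir, "radeon_ring_gfx"));
}
```

Calling toy_ring_init() twice on the same directory fails the second time unless toy_remove_node() runs in between, which is the kind of missing cleanup between init and reset/resume I have in mind.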
It would be interesting to know when and how often you pass through these
functions during boot before the panic.
You could try adding a dump_stack() there; the message buffer from
the core file should then tell us:
% git diff
diff --git drivers/gpu/drm/radeon/evergreen.c drivers/gpu/drm/radeon/evergreen.c
index eedb7dec0f..a6ae0cd9c4 100644
--- drivers/gpu/drm/radeon/evergreen.c
+++ drivers/gpu/drm/radeon/evergreen.c
@@ -5080,6 +5080,8 @@ static int evergreen_startup(struct radeon_device *rdev)
}
evergreen_irq_set(rdev);
+ dump_stack();
+
ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET,
RADEON_CP_PACKET2);
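If the full stack traces turn out to be too noisy, a plain call counter would at least answer the "how often" part. A userland sketch of the idea only; toy_startup() and its caller argument are hypothetical and not part of the driver:

```c
#include <stdio.h>

/* Number of times toy_startup() has been entered so far. */
static unsigned int toy_startup_calls;

/*
 * Hypothetical stand-in for a startup routine: logs each entry with a
 * running count and a caller label, and returns the updated count.
 */
static unsigned int
toy_startup(const char *caller)
{
	toy_startup_calls++;
	printf("toy_startup: call %u from %s\n", toy_startup_calls, caller);
	return (toy_startup_calls);
}
```

One call from an init path followed by one from a resume path would log counts 1 and 2, which is the pattern I would expect to see right before the panic if the suspicion holds.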
--
Bjoern A. Zeeb r15:7