[Bug 267028] kernel panics when booting with both (zfs,ko or vboxnetflt,ko or acpi_wmi.ko) and amdgpu.ko
Date: Sun, 15 Dec 2024 20:05:04 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=267028
--- Comment #232 from Mark Millard <marklmi26-fbsd@yahoo.com> ---
(In reply to Mark Millard from comment #228)
My crude traversal of the long list of nodes ends with the
sequence:
(kgdb) print *found_modules->tqh_first->link->tqe_next . . .->link->tqe_next
$206 = {link = {tqe_next = 0xfffff8000465bc80, tqe_prev = 0xfffff80004607c40},
container = 0xfffff80004b29a80, name = 0xffffffff829f801d
"amdgpu_raven_ce_bin_fw", version = 1}
(kgdb) print *found_modules->tqh_first->link->tqe_next . . .->link->tqe_next
$207 = {link = {tqe_next = 0xfffff8000465bbc0, tqe_prev = 0xfffff80004607b80},
container = 0xfffff80004b29780, name = 0xffffffff82e12000
<mmhub_client_ids_vega20> "amdgpu_raven_rlc_bin_fw",
version = 1}
(kgdb) print *found_modules->tqh_first->link->tqe_next . . .->link->tqe_next
$208 = {link = {tqe_next = 0xfffff80004607a00, tqe_prev = 0xfffff8000465bc80},
container = 0xfffff80003868c00, name = 0xffffffff82e1e000
<xgpu_fiji_mgcg_cgcg_init+368> "amdgpu_raven_mec_bin_fw",
version = 1}
(kgdb) print *found_modules->tqh_first->link->tqe_next->link->tqe_next . .
.->link->tqe_next
$209 = {link = {tqe_next = 0xfffff80000000007, tqe_prev = 0xfffff8000465bbc0},
container = 0xfffff80004b29600, name = 0xffffffff82e62026 <se_mask+242>
"amdgpu_raven_mec2_bin_fw", version = 1}
(kgdb) print *found_modules->tqh_first->link->tqe_next->link->tqe_next . .
.->link->tqe_next
$210 = {link = {tqe_next = 0xeef3f000e2c3f0, tqe_prev = 0xff54f000eef3f0},
container = 0x322ff0003287f0, name = 0xe987f000fea5f0 <error: Cannot access
memory at address 0xe987f000fea5f0>,
version = 15660016}
The ones that also show prefix <...> text like:
<mmhub_client_ids_vega20> "amdgpu_raven_rlc_bin_fw"
<xgpu_fiji_mgcg_cgcg_init+368> "amdgpu_raven_mec_bin_fw"
<se_mask+242> "amdgpu_raven_mec2_bin_fw"
are not the first ones to do so. Also note the duplication of
"amdgpu_raven_mec2_bin_fw".
Also note the tqe_next = 0xfffff80000000007 that, when dereferenced,
ends up with what is clearly garbage for the purpose of the list:
link = {tqe_next = 0xeef3f000e2c3f0, tqe_prev = 0xff54f000eef3f0}, container =
0x322ff0003287f0, name = 0xe987f000fea5f0 <error: Cannot access memory at
address 0xe987f000fea5f0>,
version = 15660016}
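As a rough cross-check of why those values cannot be list links, here is a
minimal Python sketch. It assumes amd64 with 48-bit canonical addresses and
8-byte pointer alignment; the helper names are hypothetical and are not part
of kgdb. The pointer values are copied from the output above.

```python
# Hypothetical helpers (not part of kgdb): flag pointer values that cannot be
# valid kernel list links on amd64. Assumes 48-bit canonical addressing.

def is_canonical(addr: int) -> bool:
    """True if bits 63..47 are all equal (48-bit canonical form)."""
    top = addr >> 47
    return top == 0 or top == 0x1FFFF

def looks_like_kernel_node(addr: int, align: int = 8) -> bool:
    """Canonical, in the upper (kernel) half, and pointer-aligned."""
    return is_canonical(addr) and (addr >> 63) == 1 and addr % align == 0

# Values copied from the kgdb output for $208, $209 and $210.
links = {
    "$208 tqe_next": 0xfffff80004607a00,
    "$209 tqe_next": 0xfffff80000000007,
    "$210 tqe_next": 0x00eef3f000e2c3f0,
    "$210 tqe_prev": 0x00ff54f000eef3f0,
}

verdicts = {label: looks_like_kernel_node(value) for label, value in links.items()}
for label, ok in verdicts.items():
    print(f"{label}: {'plausible' if ok else 'suspect'}")
```

Under those assumptions only the $208 link passes: the $209 tqe_next is
canonical but misaligned, and both $210 values are non-canonical.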
For reference:
(kgdb) print &mmhub_client_ids_vega20
$211 = (<data variable, no debug info> *) 0xffffffff82e12000
<mmhub_client_ids_vega20>
(kgdb) print &xgpu_fiji_mgcg_cgcg_init
$212 = (<data variable, no debug info> *) 0xffffffff82e1de90
<xgpu_fiji_mgcg_cgcg_init>
(kgdb) print &se_mask
$213 = (<data variable, no debug info> *) 0xffffffff82e41d7c <se_mask>
Those addresses are in the .rodata for /boot/modules/amdgpu.ko :
0xffffffff81d82880 - 0xffffffff82200000 is .bss
0xffffffff82a00000 - 0xffffffff82d9a000 is .text in
/boot/modules/amdgpu.ko
0xffffffff82d9a000 - 0xffffffff82eea000 is .rodata in
/boot/modules/amdgpu.ko
0xffffffff82eea000 - 0xffffffff82ef7948 is .bss in
/boot/modules/amdgpu.ko
0xffffffff82ef7950 - 0xffffffff82f064b8 is .data in
/boot/modules/amdgpu.ko
0xffffffff82f064b8 - 0xffffffff82f068d0 is set_sysctl_set in
/boot/modules/amdgpu.ko
0xffffffff82f068d0 - 0xffffffff82f068f8 is set_sysinit_set in
/boot/modules/amdgpu.ko
0xffffffff82f068f8 - 0xffffffff82f06908 is set_sysuninit_set in
/boot/modules/amdgpu.ko
0xffffffff82f06908 - 0xffffffff82f06958 is set_modmetadata_set in
/boot/modules/amdgpu.ko
0xffffffff82f06958 - 0xffffffff82f0697c is .note.gnu.build-id in
/boot/modules/amdgpu.ko
That matches up with the node with:
link = {tqe_next = 0xfffff8000465bc80
referencing:
$207 = {link = {tqe_next = 0xfffff8000465bbc0, tqe_prev = 0xfffff80004607b80},
container = 0xfffff80004b29780, name = 0xffffffff82e12000
<mmhub_client_ids_vega20> "amdgpu_raven_rlc_bin_fw"
where name has the address 0xffffffff82e12000 .
I'll note that the name = 0xffffffff829f801d "amdgpu_raven_ce_bin_fw" before
the oddities lands between the .bss for the kernel and the .text
for /boot/modules/amdgpu.ko (not in either one):
0xffffffff81d82880 - 0xffffffff82200000 is .bss
0xffffffff82a00000 - 0xffffffff82d9a000 is .text in
/boot/modules/amdgpu.ko
. . .
0xffffffff829a3360 - 0xffffffff829a3384 is .note.gnu.build-id in
/boot/modules/ttm.ko
For reference, the first node's name field has:
name = 0xffffffff81184803 "cam"
That is in the kernel's .rodata :
0xffffffff81184400 - 0xffffffff817f68d0 is .rodata
There are earlier:
name = 0xffffffff8298b0de <drm_ioctls+350> "iic"
and:
name = 0xffffffff8298f24f <orientation_data+6415> "linuxkpi_gplv2"
in:
0xffffffff82973000 - 0xffffffff82991000 is .rodata in
/boot/modules/drm.ko
name = 0xffffffff829a21c2 <global_write_combined+370> "ttm"
in:
0xffffffff829a2000 - 0xffffffff829a2eb0 is .bss in /boot/modules/ttm.ko
(Not the just prior .rodata for /boot/modules/ttm.ko .)
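The by-hand range lookups above can be sketched in Python. This is only an
illustration: the SECTIONS table is a hand-copied subset of the "info file"
ranges (treated as half-open, start inclusive, end exclusive), and classify()
is a hypothetical helper, not a kgdb command.

```python
# Subset of the "info file" ranges, copied by hand: (start, end, label).
# Ranges are treated as half-open: start <= addr < end.
SECTIONS = [
    (0xffffffff81184400, 0xffffffff817f68d0, ".rodata (kernel)"),
    (0xffffffff81d82880, 0xffffffff82200000, ".bss (kernel)"),
    (0xffffffff82973000, 0xffffffff82991000, ".rodata in drm.ko"),
    (0xffffffff829a1000, 0xffffffff829a2000, ".rodata in ttm.ko"),
    (0xffffffff829a2000, 0xffffffff829a2eb0, ".bss in ttm.ko"),
    (0xffffffff82a00000, 0xffffffff82d9a000, ".text in amdgpu.ko"),
    (0xffffffff82d9a000, 0xffffffff82eea000, ".rodata in amdgpu.ko"),
]

def classify(addr):
    """Return the label of the first listed range containing addr, else None."""
    for start, end, label in SECTIONS:
        if start <= addr < end:
            return label
    return None

# name pointers taken from the nodes discussed above.
NAMES = {
    "cam": 0xffffffff81184803,
    "iic": 0xffffffff8298b0de,
    "linuxkpi_gplv2": 0xffffffff8298f24f,
    "ttm": 0xffffffff829a21c2,
    "amdgpu_raven_ce_bin_fw": 0xffffffff829f801d,
    "amdgpu_raven_rlc_bin_fw": 0xffffffff82e12000,
    "amdgpu_raven_mec_bin_fw": 0xffffffff82e1e000,
    "amdgpu_raven_mec2_bin_fw": 0xffffffff82e62026,
}

for name, addr in NAMES.items():
    print(f"{name}: {classify(addr) or 'not in any listed range'}")
```

With this table, "amdgpu_raven_ce_bin_fw" is the only name that falls outside
every listed range, and "ttm" is the only one that lands in a .bss range.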
For reference:
$198 = {link = {tqe_next = 0xfffff80003904d00, tqe_prev = 0xfffff8000465a1c0},
container = 0xfffff8000464b180, name = 0xffffffff8297644b "drmn", version = 2}
has its tqe_next pointing to the ttm using .bss for the name string:
$199 = {link = {tqe_next = 0xfffff8000465bd00, tqe_prev = 0xfffff8000465a3c0},
container = 0xfffff8000469da80, name = 0xffffffff829a21c2
<global_write_combined+370> "ttm", version = 1}
I will note that /boot/modules/ttm.ko is the last (most recent) to
show up in the "info file" kgdb output:
(kgdb) info file
Symbols from
"/usr/home/root/failing-kernel-files/usr/lib/debug/boot/kernel/kernel.debug".
Kernel core dump file:
`/usr/home/root/failing-kernel-files/vmcore.8', file type FreeBSD
kernel vmcore.
Local exec file:
`/usr/home/root/failing-kernel-files/boot/kernel/kernel', file type
elf64-x86-64-freebsd.
Entry point: 0xffffffff8038e000
0xffffffff802002a8 - 0xffffffff802002b5 is .interp
0xffffffff802002b8 - 0xffffffff80231108 is .hash
0xffffffff80231108 - 0xffffffff8025f9e4 is .gnu.hash
0xffffffff8025f9e8 - 0xffffffff802f24c0 is .dynsym
0xffffffff802f24c0 - 0xffffffff8036d162 is .dynstr
0xffffffff8036d168 - 0xffffffff8038db08 is .rela.dyn
0xffffffff8038e000 - 0xffffffff811843f8 is .text
0xffffffff81184400 - 0xffffffff817f68d0 is .rodata
0xffffffff817f68d0 - 0xffffffff817fba38 is set_sysctl_set
0xffffffff817fba38 - 0xffffffff817fef60 is set_modmetadata_set
0xffffffff817fef60 - 0xffffffff817fefb8 is set_cam_xpt_xport_set
0xffffffff817fefb8 - 0xffffffff817fefe0 is set_cam_xpt_proto_set
0xffffffff817fefe0 - 0xffffffff817ff028 is set_ah_chips
0xffffffff817ff028 - 0xffffffff817ff078 is set_ah_rfs
0xffffffff817ff078 - 0xffffffff817ff098 is set_kbddriver_set
0xffffffff817ff098 - 0xffffffff817ff150 is set_sdt_providers_set
0xffffffff817ff150 - 0xffffffff81800268 is set_sdt_probes_set
0xffffffff81800268 - 0xffffffff818035c8 is set_sdt_argtypes_set
0xffffffff818035c8 - 0xffffffff818035e0 is set_scterm_set
0xffffffff818035e0 - 0xffffffff81803608 is set_cons_set
0xffffffff81803608 - 0xffffffff81803610 is
set_uart_acpi_class_and_device_set
0xffffffff81803620 - 0xffffffff81803660 is usb_host_id
0xffffffff81803660 - 0xffffffff81803680 is set_vt_drv_set
0xffffffff81803680 - 0xffffffff818036a8 is set_elf64_regset
0xffffffff818036a8 - 0xffffffff818036d8 is set_elf32_regset
0xffffffff818036d8 - 0xffffffff818036e8 is set_compressors
0xffffffff818036e8 - 0xffffffff818036f0 is set_kdb_dbbe_set
0xffffffff818036f0 - 0xffffffff81803700 is set_ratectl_set
0xffffffff81803700 - 0xffffffff81803718 is set_crypto_set
0xffffffff81803718 - 0xffffffff81803730 is set_ieee80211_ioctl_getset
0xffffffff81803730 - 0xffffffff81803748 is set_ieee80211_ioctl_setset
0xffffffff81803748 - 0xffffffff81803770 is set_scanner_set
0xffffffff81803770 - 0xffffffff81803790 is set_videodriver_set
0xffffffff81803790 - 0xffffffff818037d8 is set_scrndr_set
0xffffffff818037d8 - 0xffffffff81803820 is set_vga_set
0xffffffff81803820 - 0xffffffff81804881 is kern_conf
0xffffffff81804884 - 0xffffffff818048a8 is .note.gnu.build-id
0xffffffff818048a8 - 0xffffffff8180493c is .eh_frame
0xffffffff81a00000 - 0xffffffff81a00140 is .dynamic
0xffffffff81a00140 - 0xffffffff81a01000 is .relro_padding
0xffffffff81c00000 - 0xffffffff81c00035 is .data.read_frequently
0xffffffff81c00040 - 0xffffffff81c017f4 is .data.read_mostly
0xffffffff81c01800 - 0xffffffff81c07680 is .data.exclusive_cache_line
0xffffffff81c08000 - 0xffffffff81d51248 is .data
0xffffffff81d51248 - 0xffffffff81d54688 is set_sysinit_set
0xffffffff81d54688 - 0xffffffff81d55e48 is set_sysuninit_set
0xffffffff81d55e80 - 0xffffffff81d592e8 is set_pcpu
0xffffffff81d592f0 - 0xffffffff81d82851 is set_vnet
0xffffffff81d82880 - 0xffffffff82200000 is .bss
0xffffffff82a00000 - 0xffffffff82d9a000 is .text in
/boot/modules/amdgpu.ko
0xffffffff82d9a000 - 0xffffffff82eea000 is .rodata in
/boot/modules/amdgpu.ko
0xffffffff82eea000 - 0xffffffff82ef7948 is .bss in
/boot/modules/amdgpu.ko
0xffffffff82ef7950 - 0xffffffff82f064b8 is .data in
/boot/modules/amdgpu.ko
0xffffffff82f064b8 - 0xffffffff82f068d0 is set_sysctl_set in
/boot/modules/amdgpu.ko
0xffffffff82f068d0 - 0xffffffff82f068f8 is set_sysinit_set in
/boot/modules/amdgpu.ko
0xffffffff82f068f8 - 0xffffffff82f06908 is set_sysuninit_set in
/boot/modules/amdgpu.ko
0xffffffff82f06908 - 0xffffffff82f06958 is set_modmetadata_set in
/boot/modules/amdgpu.ko
0xffffffff82f06958 - 0xffffffff82f0697c is .note.gnu.build-id in
/boot/modules/amdgpu.ko
0xffffffff82918000 - 0xffffffff82973000 is .text in
/boot/modules/drm.ko
0xffffffff82973000 - 0xffffffff82991000 is .rodata in
/boot/modules/drm.ko
0xffffffff82991000 - 0xffffffff829911e0 is .bss in /boot/modules/drm.ko
0xffffffff829911e0 - 0xffffffff82992df8 is .data in
/boot/modules/drm.ko
0xffffffff82992df8 - 0xffffffff82992e80 is set_sysinit_set in
/boot/modules/drm.ko
0xffffffff82992e80 - 0xffffffff82992ef0 is set_sysuninit_set in
/boot/modules/drm.ko
0xffffffff82992ef0 - 0xffffffff82992fc0 is set_sysctl_set in
/boot/modules/drm.ko
0xffffffff82992fc0 - 0xffffffff82992fcc is .data.read_mostly in
/boot/modules/drm.ko
0xffffffff82992fd0 - 0xffffffff82993050 is set_modmetadata_set in
/boot/modules/drm.ko
0xffffffff82993050 - 0xffffffff82993074 is .note.gnu.build-id in
/boot/modules/drm.ko
0xffffffff8298d000 - 0xffffffff8298d000 is .text in
/boot/modules/linuxkpi_gplv2.ko
0xffffffff8298d000 - 0xffffffff8298e000 is .rodata in
/boot/modules/linuxkpi_gplv2.ko
0xffffffff8298e000 - 0xffffffff8298e0d0 is .data in
/boot/modules/linuxkpi_gplv2.ko
0xffffffff8298e0d0 - 0xffffffff8298e100 is set_modmetadata_set in
/boot/modules/linuxkpi_gplv2.ko
0xffffffff8298e100 - 0xffffffff8298e124 is .note.gnu.build-id in
/boot/modules/linuxkpi_gplv2.ko
0xffffffff82991000 - 0xffffffff82996000 is .text in
/boot/modules/dmabuf.ko
0xffffffff82996000 - 0xffffffff82997000 is .rodata in
/boot/modules/dmabuf.ko
0xffffffff82997000 - 0xffffffff82997280 is .data in
/boot/modules/dmabuf.ko
0xffffffff82997280 - 0xffffffff82997290 is set_modmetadata_set in
/boot/modules/dmabuf.ko
0xffffffff82997290 - 0xffffffff829972a8 is set_sysinit_set in
/boot/modules/dmabuf.ko
0xffffffff829972a8 - 0xffffffff829972c0 is set_sysuninit_set in
/boot/modules/dmabuf.ko
0xffffffff829972c0 - 0xffffffff82997358 is .bss in
/boot/modules/dmabuf.ko
0xffffffff82997358 - 0xffffffff8299737c is .note.gnu.build-id in
/boot/modules/dmabuf.ko
0xffffffff82998000 - 0xffffffff829a1000 is .text in
/boot/modules/ttm.ko
0xffffffff829a1000 - 0xffffffff829a2000 is .rodata in
/boot/modules/ttm.ko
0xffffffff829a2000 - 0xffffffff829a2eb0 is .bss in /boot/modules/ttm.ko
0xffffffff829a2eb0 - 0xffffffff829a32e8 is .data in
/boot/modules/ttm.ko
0xffffffff829a32e8 - 0xffffffff829a3320 is set_sysctl_set in
/boot/modules/ttm.ko
0xffffffff829a3320 - 0xffffffff829a3350 is set_modmetadata_set in
/boot/modules/ttm.ko
0xffffffff829a3350 - 0xffffffff829a3358 is set_sysinit_set in
/boot/modules/ttm.ko
0xffffffff829a3358 - 0xffffffff829a3360 is set_sysuninit_set in
/boot/modules/ttm.ko
0xffffffff829a3360 - 0xffffffff829a3384 is .note.gnu.build-id in
/boot/modules/ttm.ko
(kgdb)
For reference:
$210 = {link = {tqe_next = 0xeef3f000e2c3f0, tqe_prev = 0xff54f000eef3f0},
container = 0x322ff0003287f0, name = 0xe987f000fea5f0 <error: Cannot access
memory at address 0xe987f000fea5f0>,
shows up only after dereferencing through something like
200 nodes in the list.
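The point where the list goes bad can also be located by checking the back
pointers instead of reading each node by eye. The following is a minimal
Python sketch over the values from $206 through $210. It assumes that the
link member sits at offset 0 of the node, so that a node's tqe_prev (the
address of the previous node's tqe_next field) equals the previous node's
address. The node addresses are inferred from the neighboring tqe_next and
tqe_prev values, not printed directly by kgdb, and first_bad_link() is a
hypothetical helper.

```python
# node address -> (tqe_next, tqe_prev), values copied from $206..$210.
# Node addresses are inferred: each is the previous node's tqe_next, and the
# address used for $206 is taken from $207's tqe_prev (offset-0 assumption).
NODES = {
    0xfffff80004607b80: (0xfffff8000465bc80, 0xfffff80004607c40),  # $206
    0xfffff8000465bc80: (0xfffff8000465bbc0, 0xfffff80004607b80),  # $207
    0xfffff8000465bbc0: (0xfffff80004607a00, 0xfffff8000465bc80),  # $208
    0xfffff80004607a00: (0xfffff80000000007, 0xfffff8000465bbc0),  # $209
    0xfffff80000000007: (0x00eef3f000e2c3f0, 0x00ff54f000eef3f0),  # $210
}

LINK_OFFSET = 0  # assumed offset of link.tqe_next within a node

def first_bad_link(start, max_nodes=1000):
    """Walk forward from start. Return the address of the first node whose
    successor does not point back at it, or None if no mismatch is found in
    the captured data. max_nodes bounds the walk in case of a cycle."""
    addr = start
    for _ in range(max_nodes):
        nxt, _prev = NODES[addr]
        if nxt == 0 or nxt not in NODES:
            return None  # end of list, or ran out of captured nodes
        _next_next, nxt_prev = NODES[nxt]
        if nxt_prev != addr + LINK_OFFSET:
            return addr
        addr = nxt
    return None

bad = first_bad_link(0xfffff80004607b80)
print(hex(bad) if bad is not None else "no mismatch in captured nodes")
```

Under those assumptions the back pointers agree for $206 through $209, and
the first mismatch is reported at the node inferred to be $209, the one
holding tqe_next = 0xfffff80000000007.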
With that I'll stop this specific note.