[Bug 277476] graphics/drm-515-kmod: amdgpu periodic hangs due to phys contig allocations
Date: Wed, 12 Mar 2025 20:09:54 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277476
--- Comment #20 from Ivan Rozhuk <rozhuk.im@gmail.com> ---
Created attachment 258607
--> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=258607&action=edit
dtrace profile
The patch did not help, at least in my case: Xorg + the amdgpu X driver.
This is how it landed on 14/stable:
https://github.com/rozhuk-im/freebsd/commit/b739c10c50aa37e247dc95f7b93f6fe58d86016d
I have attached dtrace profile output that was captured while the freezes
were happening.
I do not see vm_phys_alloc_contig() after ttm_pool_alloc() here; probably
the -O2/-O3 optimization level "optimized" it out.
Here are a few new things that show increased latency during freezes
(I did not collect many freezes; in some tests only a few were captured):
kernel`lock_delay+0x12
amdgpu.ko`amdgpu_gem_fault+0x86
kernel`linux_cdev_pager_populate+0x128
kernel`vm_fault_allocate+0x185
kernel`vm_fault+0x39c
kernel`vm_fault_trap+0x4c
kernel`trap_pfault+0x20a
kernel`trap+0x4a8
kernel`0xffffffff80a11ca8
20
dtrace -n 'fbt::amdgpu_gem_fault:entry{self->ts=timestamp}' \
  -n 'fbt::amdgpu_gem_fault:return/self->ts/{this->delta=timestamp-self->ts; @=quantize(this->delta);}' \
  -n 'tick-1sec{printa(@)}'
0 66064 :tick-1sec
value ------------- Distribution ------------- count
512 | 0
1024 | 1
2048 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1357
4096 |@@@ 110
8192 |@@@ 103
16384 | 4
32768 | 3
65536 | 0
131072 | 0
262144 | 0
524288 | 0
1048576 | 0
2097152 | 1
4194304 | 3
8388608 | 2
16777216 | 0
33554432 | 0
67108864 | 0
134217728 | 0
268435456 | 1
536870912 | 1
1073741824 | 1
2147483648 | 0
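A note on reading these histograms: the DTrace timestamp variable counts nanoseconds, and each quantize() row labelled N counts samples in the range [N, 2N). The sketch below converts a row label into a microsecond range. It is only an illustration: bucket_range is a made-up helper name, and it assumes the shell does 64-bit arithmetic.

```shell
#!/bin/sh
# quantize() rows are power-of-two buckets; the label is the lower bound.
# The deltas above are differences of `timestamp`, i.e. nanoseconds.
# bucket_range is a hypothetical helper, not part of dtrace.
bucket_range() {
    ns=$1
    # Row labelled N covers [N, 2N) ns; print the bounds in microseconds
    # (integer division, so the bounds are truncated).
    echo "$ns ns bucket: $((ns / 1000)) to $((ns * 2 / 1000)) us"
}

bucket_range 2048        # row holding most amdgpu_gem_fault samples
bucket_range 1073741824  # slowest row with a sample in that histogram
```

Read this way, a typical amdgpu_gem_fault call takes a few microseconds, while the slowest one recorded here took more than one second.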
kernel`lock_delay+0x14
kernel`malloc_large+0x2c
kernel`lkpi_kmalloc_cb+0x44
kernel`lkpi_kmalloc+0x27
amdgpu.ko`dc_create_state+0x18
amdgpu.ko`amdgpu_dm_atomic_commit_tail+0xd4
drm.ko`commit_tail+0xa7
kernel`linux_work_fn+0xed
kernel`taskqueue_run_locked+0x187
kernel`taskqueue_thread_loop+0xc2
kernel`fork_exit+0x86
kernel`0xffffffff80a12d0e
88
dtrace -n 'fbt::dc_create_state:entry{self->ts=timestamp}' \
  -n 'fbt::dc_create_state:return/self->ts/{this->delta=timestamp-self->ts; @=quantize(this->delta);}' \
  -n 'tick-1sec{printa(@)}'
0 66064 :tick-1sec
value ------------- Distribution ------------- count
4096 | 0
8192 | 2
16384 |@@@@@@@@@@@@@@@@@@@@@ 1271
32768 |@@@@@@@@@@@@@@@@@@ 1087
65536 | 30
131072 | 1
262144 | 0
524288 | 3
1048576 | 4
2097152 | 2
4194304 | 0
8388608 | 0
16777216 | 0
33554432 | 0
67108864 | 0
134217728 | 1
268435456 | 1
536870912 | 5
1073741824 | 4
2147483648 | 0
kernel`lock_delay+0x14
kernel`free+0x9b
amdgpu.ko`amdgpu_dm_atomic_commit_tail+0x2f9a
drm.ko`commit_tail+0xa7
kernel`linux_work_fn+0xed
kernel`taskqueue_run_locked+0x187
kernel`taskqueue_thread_loop+0xc2
kernel`fork_exit+0x86
kernel`0xffffffff809aaf6e
399
dtrace -n 'fbt::amdgpu_dm_atomic_commit_tail:entry{self->ts=timestamp}' \
  -n 'fbt::amdgpu_dm_atomic_commit_tail:return/self->ts/{this->delta=timestamp-self->ts; @=quantize(this->delta);}' \
  -n 'tick-1sec{printa(@)}'
0 66190 :tick-1sec
value ------------- Distribution ------------- count
16384 | 0
32768 | 4
65536 | 6
131072 | 2
262144 | 0
524288 | 6
1048576 | 15
2097152 |@ 29
4194304 |@@@ 106
8388608 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1323
16777216 |@ 44
33554432 | 1
67108864 | 2
134217728 | 4
268435456 | 8
536870912 | 5
1073741824 | 5
2147483648 | 0
kernel`lock_delay+0x14
kernel`zone_import+0xf2
kernel`cache_alloc+0x309
kernel`cache_alloc_retry+0x2c
kernel`malloc+0x48
ttm.ko`ttm_sg_tt_init+0x61
amdgpu.ko`amdgpu_ttm_tt_create+0x4a
ttm.ko`ttm_tt_create+0x4e
ttm.ko`ttm_bo_validate+0x60
ttm.ko`ttm_bo_init_reserved+0x194
amdgpu.ko`amdgpu_bo_create+0x295
amdgpu.ko`amdgpu_bo_create_user+0x21
amdgpu.ko`amdgpu_gem_userptr_ioctl+0x82
drm.ko`drm_ioctl_kernel+0xbc
drm.ko`drm_ioctl+0x25e
kernel`linux_file_ioctl+0x30f
kernel`kern_ioctl+0x1b0
kernel`sys_ioctl+0x117
kernel`amd64_syscall+0xeb
kernel`0xffffffff809aa81b
46
dtrace -n 'fbt::amdgpu_ttm_tt_create:entry{self->ts=timestamp}' \
  -n 'fbt::amdgpu_ttm_tt_create:return/self->ts/{this->delta=timestamp-self->ts; @=quantize(this->delta);}' \
  -n 'tick-1sec{printa(@)}'
0 66190 :tick-1sec
value ------------- Distribution ------------- count
128 | 0
256 | 4
512 |@@@@@@@@@@ 5764
1024 |@@@@@@@@@@@@ 6635
2048 |@@@@@@@@@ 5087
4096 |@@@@@@@@ 4334
8192 |@@ 875
16384 | 72
32768 | 9
65536 | 3
131072 | 0
(this looks ok)
dtrace -n 'fbt::amdgpu_bo_create:entry{self->ts=timestamp}' \
  -n 'fbt::amdgpu_bo_create:return/self->ts/{this->delta=timestamp-self->ts; @=quantize(this->delta);}' \
  -n 'tick-1sec{printa(@)}'
0 66190 :tick-1sec
value ------------- Distribution ------------- count
256 | 0
512 | 2
1024 |@@@@@@ 2303
2048 |@@@@@@@@@@ 4190
4096 |@@@@@@@@@@@@ 4800
8192 |@@@@@@@@@@ 4002
16384 |@@ 845
32768 | 124
65536 | 39
131072 | 20
262144 | 4
524288 | 9
1048576 | 2
2097152 | 3
4194304 | 8
8388608 | 5
16777216 | 0
33554432 | 1
67108864 | 2
134217728 | 1
268435456 | 0
536870912 | 2
1073741824 | 0
2147483648 | 1
4294967296 | 0
dtrace -n 'fbt::add_hole:entry{self->ts=timestamp}' \
  -n 'fbt::add_hole:return/self->ts/{this->delta=timestamp-self->ts; @=quantize(this->delta);}' \
  -n 'tick-1sec{printa(@)}'
0 66190 :tick-1sec
value ------------- Distribution ------------- count
128 | 0
256 |@@@@@@@@@@@@ 5762
512 |@@@@@@@@@@@@@ 6548
1024 |@@@@@@@ 3287
2048 |@@@@@ 2648
4096 |@@@ 1508
8192 | 105
16384 | 11
32768 | 4
65536 | 1
131072 | 0
(this looks ok)
dtrace -n 'fbt::ttm_pool_alloc:entry{self->ts=timestamp}' \
  -n 'fbt::ttm_pool_alloc:return/self->ts/{this->delta=timestamp-self->ts; @=quantize(this->delta);}' \
  -n 'tick-1sec{printa(@)}'
0 66190 :tick-1sec
value ------------- Distribution ------------- count
128 | 0
256 |@@ 29
512 |@@@@@@@@ 96
1024 |@@@@@@ 81
2048 |@@@@@ 67
4096 |@@@@@@@@ 106
8192 |@@@ 33
16384 | 5
32768 | 1
65536 |@ 17
131072 |@ 12
262144 | 3
524288 | 6
1048576 |@ 10
2097152 |@ 13
4194304 | 2
8388608 | 2
16777216 | 3
33554432 | 5
67108864 | 3
134217728 | 4
268435456 |@ 7
536870912 | 5
1073741824 | 1
2147483648 | 1
4294967296 | 0
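All of the latency histograms above come from the same one-liner with only the function name changed. For anyone who wants to repeat the measurement on another function, here is a small sketch that prints the command for a given name. It only prints and does not run dtrace; gen_latency_cmd is a made-up helper name.

```shell
#!/bin/sh
# Print the entry/return latency one-liner for one kernel function.
# gen_latency_cmd is a hypothetical helper, not part of dtrace.
gen_latency_cmd() {
    fn=$1
    # Entry probe: remember the start time in a thread-local variable.
    printf "dtrace -n 'fbt::%s:entry{self->ts=timestamp}'" "$fn"
    # Return probe: aggregate the elapsed nanoseconds with quantize().
    printf " -n 'fbt::%s:return/self->ts/{this->delta=timestamp-self->ts; @=quantize(this->delta);}'" "$fn"
    # Print the aggregation once per second.
    printf " -n 'tick-1sec{printa(@)}'\n"
}

gen_latency_cmd ttm_pool_alloc
```

The printed command has to be run as root, and the function must have fbt probes (inlined functions do not).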
If someone has ideas, I can experiment more with dtrace and test other
patches/settings.