[Bug 277476] graphics/drm-515-kmod: amdgpu periodic hangs due to phys contig allocations

From: <bugzilla-noreply_at_freebsd.org>
Date: Wed, 12 Mar 2025 20:09:54 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277476

--- Comment #20 from Ivan Rozhuk <rozhuk.im@gmail.com> ---
Created attachment 258607
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=258607&action=edit
dtrace profile

The patch did not help, at least in the Xorg + amdgpu X driver case.
This is how it landed on 14/stable:
https://github.com/rozhuk-im/freebsd/commit/b739c10c50aa37e247dc95f7b93f6fe58d86016d


I have attached DTrace profile output that was captured while the freezes
were happening. I do not see vm_phys_alloc_contig() after ttm_pool_alloc()
in it; probably it was optimized out at the -O2/-O3 optimization level.

Here are a few new things that show increased latency during freezes
(I did not capture many freezes; some tests caught only a few):

              kernel`lock_delay+0x12
              amdgpu.ko`amdgpu_gem_fault+0x86
              kernel`linux_cdev_pager_populate+0x128
              kernel`vm_fault_allocate+0x185
              kernel`vm_fault+0x39c
              kernel`vm_fault_trap+0x4c
              kernel`trap_pfault+0x20a
              kernel`trap+0x4a8
              kernel`0xffffffff80a11ca8
               20
dtrace -n 'fbt::amdgpu_gem_fault:entry{self->ts=timestamp}' \
  -n 'fbt::amdgpu_gem_fault:return/self->ts/{this->delta=timestamp-self->ts; @=quantize(this->delta);}' \
  -n 'tick-1sec{printa(@)}'
  0  66064                       :tick-1sec 

           value  ------------- Distribution ------------- count    
             512 |                                         0        
            1024 |                                         1        
            2048 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@       1357     
            4096 |@@@                                      110      
            8192 |@@@                                      103      
           16384 |                                         4        
           32768 |                                         3        
           65536 |                                         0        
          131072 |                                         0        
          262144 |                                         0        
          524288 |                                         0        
         1048576 |                                         0        
         2097152 |                                         1        
         4194304 |                                         3        
         8388608 |                                         2        
        16777216 |                                         0        
        33554432 |                                         0        
        67108864 |                                         0        
       134217728 |                                         0        
       268435456 |                                         1        
       536870912 |                                         1        
      1073741824 |                                         1        
      2147483648 |                                         0        



              kernel`lock_delay+0x14
              kernel`malloc_large+0x2c
              kernel`lkpi_kmalloc_cb+0x44
              kernel`lkpi_kmalloc+0x27
              amdgpu.ko`dc_create_state+0x18
              amdgpu.ko`amdgpu_dm_atomic_commit_tail+0xd4
              drm.ko`commit_tail+0xa7
              kernel`linux_work_fn+0xed
              kernel`taskqueue_run_locked+0x187
              kernel`taskqueue_thread_loop+0xc2
              kernel`fork_exit+0x86
              kernel`0xffffffff80a12d0e
               88
dtrace -n 'fbt::dc_create_state:entry{self->ts=timestamp}' \
  -n 'fbt::dc_create_state:return/self->ts/{this->delta=timestamp-self->ts; @=quantize(this->delta);}' \
  -n 'tick-1sec{printa(@)}'
  0  66064                       :tick-1sec 

           value  ------------- Distribution ------------- count    
            4096 |                                         0        
            8192 |                                         2        
           16384 |@@@@@@@@@@@@@@@@@@@@@                    1271     
           32768 |@@@@@@@@@@@@@@@@@@                       1087     
           65536 |                                         30       
          131072 |                                         1        
          262144 |                                         0        
          524288 |                                         3        
         1048576 |                                         4        
         2097152 |                                         2        
         4194304 |                                         0        
         8388608 |                                         0        
        16777216 |                                         0        
        33554432 |                                         0        
        67108864 |                                         0        
       134217728 |                                         1        
       268435456 |                                         1        
       536870912 |                                         5        
      1073741824 |                                         4        
      2147483648 |                                         0        


              kernel`lock_delay+0x14
              kernel`free+0x9b
              amdgpu.ko`amdgpu_dm_atomic_commit_tail+0x2f9a
              drm.ko`commit_tail+0xa7
              kernel`linux_work_fn+0xed
              kernel`taskqueue_run_locked+0x187
              kernel`taskqueue_thread_loop+0xc2
              kernel`fork_exit+0x86
              kernel`0xffffffff809aaf6e
              399
dtrace -n 'fbt::amdgpu_dm_atomic_commit_tail:entry{self->ts=timestamp}' \
  -n 'fbt::amdgpu_dm_atomic_commit_tail:return/self->ts/{this->delta=timestamp-self->ts; @=quantize(this->delta);}' \
  -n 'tick-1sec{printa(@)}'
  0  66190                       :tick-1sec 

           value  ------------- Distribution ------------- count    
           16384 |                                         0        
           32768 |                                         4        
           65536 |                                         6        
          131072 |                                         2        
          262144 |                                         0        
          524288 |                                         6        
         1048576 |                                         15       
         2097152 |@                                        29       
         4194304 |@@@                                      106      
         8388608 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@       1323     
        16777216 |@                                        44       
        33554432 |                                         1        
        67108864 |                                         2        
       134217728 |                                         4        
       268435456 |                                         8        
       536870912 |                                         5        
      1073741824 |                                         5        
      2147483648 |                                         0        


              kernel`lock_delay+0x14
              kernel`zone_import+0xf2
              kernel`cache_alloc+0x309
              kernel`cache_alloc_retry+0x2c
              kernel`malloc+0x48
              ttm.ko`ttm_sg_tt_init+0x61
              amdgpu.ko`amdgpu_ttm_tt_create+0x4a
              ttm.ko`ttm_tt_create+0x4e
              ttm.ko`ttm_bo_validate+0x60
              ttm.ko`ttm_bo_init_reserved+0x194
              amdgpu.ko`amdgpu_bo_create+0x295
              amdgpu.ko`amdgpu_bo_create_user+0x21
              amdgpu.ko`amdgpu_gem_userptr_ioctl+0x82
              drm.ko`drm_ioctl_kernel+0xbc
              drm.ko`drm_ioctl+0x25e
              kernel`linux_file_ioctl+0x30f
              kernel`kern_ioctl+0x1b0
              kernel`sys_ioctl+0x117
              kernel`amd64_syscall+0xeb
              kernel`0xffffffff809aa81b
               46
dtrace -n 'fbt::amdgpu_ttm_tt_create:entry{self->ts=timestamp}' \
  -n 'fbt::amdgpu_ttm_tt_create:return/self->ts/{this->delta=timestamp-self->ts; @=quantize(this->delta);}' \
  -n 'tick-1sec{printa(@)}'
  0  66190                       :tick-1sec 

           value  ------------- Distribution ------------- count    
             128 |                                         0        
             256 |                                         4        
             512 |@@@@@@@@@@                               5764     
            1024 |@@@@@@@@@@@@                             6635     
            2048 |@@@@@@@@@                                5087     
            4096 |@@@@@@@@                                 4334     
            8192 |@@                                       875      
           16384 |                                         72       
           32768 |                                         9        
           65536 |                                         3        
          131072 |                                         0        
(this looks ok)


dtrace -n 'fbt::amdgpu_bo_create:entry{self->ts=timestamp}' \
  -n 'fbt::amdgpu_bo_create:return/self->ts/{this->delta=timestamp-self->ts; @=quantize(this->delta);}' \
  -n 'tick-1sec{printa(@)}'
  0  66190                       :tick-1sec 

           value  ------------- Distribution ------------- count    
             256 |                                         0        
             512 |                                         2        
            1024 |@@@@@@                                   2303     
            2048 |@@@@@@@@@@                               4190     
            4096 |@@@@@@@@@@@@                             4800     
            8192 |@@@@@@@@@@                               4002     
           16384 |@@                                       845      
           32768 |                                         124      
           65536 |                                         39       
          131072 |                                         20       
          262144 |                                         4        
          524288 |                                         9        
         1048576 |                                         2        
         2097152 |                                         3        
         4194304 |                                         8        
         8388608 |                                         5        
        16777216 |                                         0        
        33554432 |                                         1        
        67108864 |                                         2        
       134217728 |                                         1        
       268435456 |                                         0        
       536870912 |                                         2        
      1073741824 |                                         0        
      2147483648 |                                         1        
      4294967296 |                                         0        

dtrace -n 'fbt::add_hole:entry{self->ts=timestamp}' \
  -n 'fbt::add_hole:return/self->ts/{this->delta=timestamp-self->ts; @=quantize(this->delta);}' \
  -n 'tick-1sec{printa(@)}'
  0  66190                       :tick-1sec 

           value  ------------- Distribution ------------- count    
             128 |                                         0        
             256 |@@@@@@@@@@@@                             5762     
             512 |@@@@@@@@@@@@@                            6548     
            1024 |@@@@@@@                                  3287     
            2048 |@@@@@                                    2648     
            4096 |@@@                                      1508     
            8192 |                                         105      
           16384 |                                         11       
           32768 |                                         4        
           65536 |                                         1        
          131072 |                                         0        
(this looks ok)

dtrace -n 'fbt::ttm_pool_alloc:entry{self->ts=timestamp}' \
  -n 'fbt::ttm_pool_alloc:return/self->ts/{this->delta=timestamp-self->ts; @=quantize(this->delta);}' \
  -n 'tick-1sec{printa(@)}'
  0  66190                       :tick-1sec 

           value  ------------- Distribution ------------- count    
             128 |                                         0        
             256 |@@                                       29       
             512 |@@@@@@@@                                 96       
            1024 |@@@@@@                                   81       
            2048 |@@@@@                                    67       
            4096 |@@@@@@@@                                 106      
            8192 |@@@                                      33       
           16384 |                                         5        
           32768 |                                         1        
           65536 |@                                        17       
          131072 |@                                        12       
          262144 |                                         3        
          524288 |                                         6        
         1048576 |@                                        10       
         2097152 |@                                        13       
         4194304 |                                         2        
         8388608 |                                         2        
        16777216 |                                         3        
        33554432 |                                         5        
        67108864 |                                         3        
       134217728 |                                         4        
       268435456 |@                                        7        
       536870912 |                                         5        
      1073741824 |                                         1        
      2147483648 |                                         1        
      4294967296 |                                         0        
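
For anyone reading the histograms: the one-liners aggregate differences of
the DTrace `timestamp` variable, so each value is the lower bound of a
power-of-two bucket in nanoseconds (1073741824 is roughly 1.07 s). A minimal
sketch of how the tail could be counted from pasted output follows. The rows
are copied from the amdgpu_gem_fault histogram above with empty buckets
omitted; the helper names and the 1 ms cut-off are only illustrative
assumptions, not part of the original measurements.

```python
import re

# Rows copied from the amdgpu_gem_fault quantize() output in this comment
# (empty buckets left out). Left column: bucket lower bound in nanoseconds.
QUANTIZE_OUTPUT = """
            1024 |                                         1
            2048 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@       1357
            4096 |@@@                                      110
            8192 |@@@                                      103
           16384 |                                         4
           32768 |                                         3
         2097152 |                                         1
         4194304 |                                         3
         8388608 |                                         2
       268435456 |                                         1
       536870912 |                                         1
      1073741824 |                                         1
"""

# "<value> |<bar of @ and spaces><count>"
ROW = re.compile(r"^\s*(\d+)\s*\|[@ ]*(\d+)\s*$")


def parse_quantize(text):
    """Return a list of (bucket_lower_bound_ns, count) pairs."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append((int(match.group(1)), int(match.group(2))))
    return rows


def slow_calls(rows, threshold_ns):
    """Sum the counts of buckets whose lower bound is >= threshold_ns."""
    return sum(count for value, count in rows if value >= threshold_ns)


rows = parse_quantize(QUANTIZE_OUTPUT)
total = sum(count for _, count in rows)
# Hypothetical cut-off: treat anything in a bucket starting at >= 1 ms as slow.
slow = slow_calls(rows, 1_000_000)
worst_ns = max(value for value, count in rows if count)
print(f"{total} calls, {slow} slow, worst bucket starts at {worst_ns / 1e6:.0f} ms")
```

The same two helpers can be pointed at any of the other histograms in this
comment by pasting its rows into QUANTIZE_OUTPUT.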


If someone has ideas, I can experiment more with DTrace and test other
patches/settings.
