[Bug 277476] graphics/drm-515-kmod: amdgpu periodic hangs due to phys contig allocations

From: <bugzilla-noreply_at_freebsd.org>
Date: Tue, 09 Sep 2025 07:58:50 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277476

--- Comment #39 from commit-hook@FreeBSD.org ---
A commit in branch main references this bug:

URL:
https://cgit.FreeBSD.org/src/commit/?id=d440953942372ca275d0743a6e220631bde440ee

commit d440953942372ca275d0743a6e220631bde440ee
Author:     Olivier Certner <olce@FreeBSD.org>
AuthorDate: 2025-07-07 20:29:12 +0000
Commit:     Olivier Certner <olce@FreeBSD.org>
CommitDate: 2025-09-09 07:56:45 +0000

    vm_domainset: Only probe domains once when iterating, instead of up to 4 times

    Because of the 'di_minskip' logic, which resets the initial domain, an
    iterator starts by considering only domains that have more than
    'free_min' pages in a first phase, and then all domains in a second one.
    Non-"underpaged" domains are thus examined twice, even if the allocation
    can't succeed.

    Scanning the same domains twice just wastes time: allocation attempts
    that must not wait may rely on failing sooner, and those that must wait
    will loop anyway (a domain previously scanned twice has more pages than
    'free_min', and consequently vm_wait_doms() will just return
    immediately).

    Additionally, the DOMAINSET_POLICY_FIRSTTOUCH policy would aggravate
    this situation by reexamining the current domain at the end of each
    phase.  In the case of a single domain, this doubles yet again the
    number of times domain 0 is probed.

    The implementation adds two 'domainset_t' fields to 'struct
    vm_domainset_iter' (and removes the 'di_n' counter).  The first,
    'di_remain_mask', contains the domains still to be explored in the
    current phase: the first phase concerns only domains with more pages
    than 'free_min' ('di_minskip' true) and the second one concerns only
    domains previously under 'free_min' ('di_minskip' false).  The second,
    'di_min_mask', holds the domains with fewer pages than 'free_min'
    encountered during the first phase, and serves as the reset value for
    'di_remain_mask' when transitioning to the second phase.

    PR:             277476
    Fixes:          e5818a53dbd2 ("Implement several enhancements to NUMA policies.")
    Fixes:          23984ce5cd24 ("Avoid resource deadlocks when one domain has exhausted its memory."...)
    MFC after:      10 days
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D51249

 sys/vm/vm_domainset.c | 53 ++++++++++++++++++++++++++++++---------------------
 sys/vm/vm_domainset.h |  6 +++++-
 2 files changed, 36 insertions(+), 23 deletions(-)
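
The two-mask scheme described in the commit message can be sketched in
userland C as follows.  This is only an illustration, not the code from
sys/vm/vm_domainset.c: it assumes a plain uint32_t bitmask in place of
'domainset_t', a hypothetical 'struct domain' holding a page count and a
'free_min' threshold, and a fixed lowest-bit-first visiting order instead
of the kernel's policy-driven order.  Only the field names 'di_remain_mask',
'di_min_mask' and 'di_minskip' are taken from the commit message; the
helpers 'iter_init', 'iter_next' and 'probe_all' are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NDOMAINS 4

/* Hypothetical stand-in for per-domain VM state (not the kernel's). */
struct domain {
	unsigned free_pages;
	unsigned free_min;
};

/* Simplified iterator; field names follow the commit message. */
struct iter {
	uint32_t di_remain_mask;	/* domains left to probe in this phase */
	uint32_t di_min_mask;		/* low domains deferred to phase 2 */
	bool	 di_minskip;		/* true while in phase 1 */
};

/* Index of the lowest set bit, or -1 if the mask is empty. */
static int
lowest_bit(uint32_t mask)
{
	for (int d = 0; d < 32; d++)
		if ((mask & (UINT32_C(1) << d)) != 0)
			return (d);
	return (-1);
}

static void
iter_init(struct iter *di, uint32_t valid_mask)
{
	di->di_remain_mask = valid_mask;
	di->di_min_mask = 0;
	di->di_minskip = true;
}

/* Return the next domain to probe, or -1 once each was probed once. */
static int
iter_next(struct iter *di, const struct domain *doms)
{
	for (;;) {
		while (di->di_remain_mask != 0) {
			int d = lowest_bit(di->di_remain_mask);

			di->di_remain_mask &= ~(UINT32_C(1) << d);
			/* Phase 1: defer domains not above 'free_min'. */
			if (di->di_minskip &&
			    doms[d].free_pages <= doms[d].free_min) {
				di->di_min_mask |= UINT32_C(1) << d;
				continue;
			}
			return (d);
		}
		if (!di->di_minskip)
			return (-1);	/* phase 2 exhausted */
		/* Switch to phase 2: only the deferred domains remain. */
		di->di_minskip = false;
		di->di_remain_mask = di->di_min_mask;
		di->di_min_mask = 0;
	}
}

/* Record the probe order into 'order'; return the number of probes. */
int
probe_all(const struct domain *doms, uint32_t valid_mask, int *order)
{
	struct iter di;
	int d, n = 0;

	iter_init(&di, valid_mask);
	while ((d = iter_next(&di, doms)) != -1) {
		assert(n < NDOMAINS);
		order[n++] = d;
	}
	return (n);
}
```

In this sketch, calling probe_all() with four domains of which domains 1
and 3 are at or below 'free_min' visits 0 and 2 in the first phase, then
1 and 3 in the second, so each domain is returned exactly once.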
