From nobody Tue May 09 17:11:58 2023
Date: Tue, 9 May 2023 17:11:58 GMT
Message-Id: <202305091711.349HBwCs095717@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Andrew Gallatin
Subject: git: 8b0dafdb2f18 - main - vm: implement vm_page_reclaim_contig_domain_ext()
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
Sender: owner-dev-commits-src-all@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: gallatin
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: 8b0dafdb2f18b9bdc464a4ddbcfd749c3d3875f1
Auto-Submitted: auto-generated

The branch main has been updated by gallatin:

URL: https://cgit.FreeBSD.org/src/commit/?id=8b0dafdb2f18b9bdc464a4ddbcfd749c3d3875f1

commit 8b0dafdb2f18b9bdc464a4ddbcfd749c3d3875f1
Author:     Andrew Gallatin
AuthorDate: 2023-05-08 13:25:40 +0000
Commit:     Andrew Gallatin
CommitDate: 2023-05-09 17:09:34 +0000

    vm: implement vm_page_reclaim_contig_domain_ext()

    Implement vm_page_reclaim_contig_domain_ext() to reclaim multiple
    contiguous regions at once.  This is more efficient for callers
    that need several contiguous regions.

    This is needed because callers like ktls may need to reclaim many
    contiguous regions, and each scan of physical memory can take
    multiple seconds on a large-memory machine (on the order of 100GB
    of RAM).

    Rather than modifying the core algorithm, I extended
    vm_page_reclaim_contig_domain() to take a "desired_runs" argument
    to allow the caller to request that it reclaim more than just a
    single run.  There is no functional change intended for existing
    callers.

    The first user for this interface is the ktls code
    (https://reviews.freebsd.org/D39421).  By reclaiming multiple runs,
    ktls goes from consuming hours of CPU to refill its buffer zone to
    just seconds or minutes.

    Differential Revision: https://reviews.freebsd.org/D39739
    Sponsored by: Netflix
    Reviewed by: alc, jhb, markj
---
 sys/vm/vm_page.c | 69 +++++++++++++++++++++++++++++++++++++++++++-------------
 sys/vm/vm_page.h |  3 +++
 2 files changed, 56 insertions(+), 16 deletions(-)

diff --git a/sys/vm/vm_page.c b/sys/vm/vm_page.c
index 90413f235ec0..4b967a94aa1f 100644
--- a/sys/vm/vm_page.c
+++ b/sys/vm/vm_page.c
@@ -2995,9 +2995,7 @@ unlock:
 
 #define	NRUNS	16
 
-CTASSERT(powerof2(NRUNS));
-
-#define	RUN_INDEX(count)	((count) & (NRUNS - 1))
+#define	RUN_INDEX(count, nruns)	((count) % (nruns))
 
 #define	MIN_RECLAIM	8
 
@@ -3025,19 +3023,42 @@ CTASSERT(powerof2(NRUNS));
  *	must be a power of two.
  */
 bool
-vm_page_reclaim_contig_domain(int domain, int req, u_long npages,
-    vm_paddr_t low, vm_paddr_t high, u_long alignment, vm_paddr_t boundary)
+vm_page_reclaim_contig_domain_ext(int domain, int req, u_long npages,
+    vm_paddr_t low, vm_paddr_t high, u_long alignment, vm_paddr_t boundary,
+    int desired_runs)
 {
 	struct vm_domain *vmd;
 	vm_paddr_t curr_low;
-	vm_page_t m_run, m_runs[NRUNS];
+	vm_page_t m_run, _m_runs[NRUNS], *m_runs;
 	u_long count, minalign, reclaimed;
-	int error, i, options, req_class;
+	int error, i, min_reclaim, nruns, options, req_class;
+	bool ret;
 
 	KASSERT(npages > 0, ("npages is 0"));
 	KASSERT(powerof2(alignment), ("alignment is not a power of 2"));
 	KASSERT(powerof2(boundary), ("boundary is not a power of 2"));
 
+	ret = false;
+
+	/*
+	 * If the caller wants to reclaim multiple runs, try to allocate
+	 * space to store the runs.  If that fails, fall back to the old
+	 * behavior of just reclaiming MIN_RECLAIM pages.
+	 */
+	if (desired_runs > 1)
+		m_runs = malloc((NRUNS + desired_runs) * sizeof(*m_runs),
+		    M_TEMP, M_NOWAIT);
+	else
+		m_runs = NULL;
+
+	if (m_runs == NULL) {
+		m_runs = _m_runs;
+		nruns = NRUNS;
+	} else {
+		nruns = NRUNS + desired_runs - 1;
+	}
+	min_reclaim = MAX(desired_runs * npages, MIN_RECLAIM);
+
 	/*
 	 * The caller will attempt an allocation after some runs have been
 	 * reclaimed and added to the vm_phys buddy lists.  Due to limitations
@@ -3066,7 +3087,7 @@ vm_page_reclaim_contig_domain(int domain, int req, u_long npages,
 	if (count < npages + vmd->vmd_free_reserved || (count < npages +
 	    vmd->vmd_interrupt_free_min && req_class == VM_ALLOC_SYSTEM) ||
 	    (count < npages && req_class == VM_ALLOC_INTERRUPT))
-		return (false);
+		goto done;
 
 	/*
 	 * Scan up to three times, relaxing the restrictions ("options") on
@@ -3085,27 +3106,29 @@ vm_page_reclaim_contig_domain(int domain, int req, u_long npages,
 			if (m_run == NULL)
 				break;
 			curr_low = VM_PAGE_TO_PHYS(m_run) + ptoa(npages);
-			m_runs[RUN_INDEX(count)] = m_run;
+			m_runs[RUN_INDEX(count, nruns)] = m_run;
 			count++;
 		}
 
 		/*
 		 * Reclaim the highest runs in LIFO (descending) order until
 		 * the number of reclaimed pages, "reclaimed", is at least
-		 * MIN_RECLAIM.  Reset "reclaimed" each time because each
+		 * "min_reclaim".  Reset "reclaimed" each time because each
 		 * reclamation is idempotent, and runs will (likely) recur
 		 * from one scan to the next as restrictions are relaxed.
 		 */
 		reclaimed = 0;
-		for (i = 0; count > 0 && i < NRUNS; i++) {
+		for (i = 0; count > 0 && i < nruns; i++) {
 			count--;
-			m_run = m_runs[RUN_INDEX(count)];
+			m_run = m_runs[RUN_INDEX(count, nruns)];
 			error = vm_page_reclaim_run(req_class, domain, npages,
 			    m_run, high);
 			if (error == 0) {
 				reclaimed += npages;
-				if (reclaimed >= MIN_RECLAIM)
-					return (true);
+				if (reclaimed >= min_reclaim) {
+					ret = true;
+					goto done;
+				}
 			}
 		}
 
@@ -3117,9 +3140,23 @@ vm_page_reclaim_contig_domain(int domain, int req, u_long npages,
 			options = VPSC_NOSUPER;
 		else if (options == VPSC_NOSUPER)
 			options = VPSC_ANY;
-		else if (options == VPSC_ANY)
-			return (reclaimed != 0);
+		else if (options == VPSC_ANY) {
+			ret = reclaimed != 0;
+			goto done;
+		}
 	}
+done:
+	if (m_runs != _m_runs)
+		free(m_runs, M_TEMP);
+	return (ret);
+}
+
+bool
+vm_page_reclaim_contig_domain(int domain, int req, u_long npages,
+    vm_paddr_t low, vm_paddr_t high, u_long alignment, vm_paddr_t boundary)
+{
+	return (vm_page_reclaim_contig_domain_ext(domain, req, npages, low, high,
+	    alignment, boundary, 1));
 }
 
 bool
diff --git a/sys/vm/vm_page.h b/sys/vm/vm_page.h
index 9563f4ac714c..824a853fb0f7 100644
--- a/sys/vm/vm_page.h
+++ b/sys/vm/vm_page.h
@@ -668,6 +668,9 @@ bool vm_page_reclaim_contig(int req, u_long npages, vm_paddr_t low,
     vm_paddr_t high, u_long alignment, vm_paddr_t boundary);
 bool vm_page_reclaim_contig_domain(int domain, int req, u_long npages,
     vm_paddr_t low, vm_paddr_t high, u_long alignment, vm_paddr_t boundary);
+bool vm_page_reclaim_contig_domain_ext(int domain, int req, u_long npages,
+    vm_paddr_t low, vm_paddr_t high, u_long alignment, vm_paddr_t boundary,
+    int desired_runs);
 void vm_page_reference(vm_page_t m);
 #define	VPR_TRYFREE	0x01
 #define	VPR_NOREUSE	0x02
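
For readers following the indexing change, here is a rough user-space sketch of how the run array appears to behave once RUN_INDEX() uses a modulo on a runtime size instead of a power-of-two mask. It is not kernel code: plain ints stand in for vm_page_t runs, and the helper names ring_capacity(), reclaim_target() and scan_then_pop() are hypothetical and do not exist in sys/vm. Only NRUNS, MIN_RECLAIM and RUN_INDEX() are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define	NRUNS		16
#define	MIN_RECLAIM	8
#define	RUN_INDEX(count, nruns)	((count) % (nruns))

/*
 * Hypothetical helper: ring size the patch would use.  "heap_ok" models
 * whether the M_NOWAIT allocation succeeded; on failure (or when only
 * one run is wanted) the on-stack array of NRUNS slots is used.
 */
static int
ring_capacity(int desired_runs, int heap_ok)
{
	if (desired_runs > 1 && heap_ok)
		return (NRUNS + desired_runs - 1);
	return (NRUNS);
}

/*
 * Hypothetical helper: page target, mirroring
 * MAX(desired_runs * npages, MIN_RECLAIM) from the patch.
 */
static unsigned long
reclaim_target(int desired_runs, unsigned long npages)
{
	unsigned long want = (unsigned long)desired_runs * npages;

	return (want > MIN_RECLAIM ? want : MIN_RECLAIM);
}

/*
 * Hypothetical helper: record run ids 0..found-1 in a ring of "nruns"
 * slots (older entries are overwritten), then pop them newest-first
 * into out[].  Returns the number of entries popped, at most "nruns".
 */
static int
scan_then_pop(int found, int nruns, int *ring, int *out)
{
	unsigned long count = 0;
	int i;

	assert(nruns > 0);
	for (i = 0; i < found; i++) {
		ring[RUN_INDEX(count, nruns)] = i;
		count++;
	}
	for (i = 0; count > 0 && i < nruns; i++) {
		count--;
		out[i] = ring[RUN_INDEX(count, nruns)];
	}
	return (i);
}
```

With a ring of 4 slots and 6 runs found, scan_then_pop() hands back runs 5, 4, 3, 2: only the most recently found runs survive, and they come out in descending order. The modulo is what lets the ring size be NRUNS + desired_runs - 1, which is generally not a power of two, so the old mask-based index and its CTASSERT no longer apply.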