Date: Sun, 9 Oct 2022 15:21:18 GMT
Message-Id: <202210091521.299FLItZ024521@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org
From: Mark Johnston
Subject: git: 8bebdbe494f6 - stable/13 - amd64: Make it possible to grow the KERNBASE region of KVA
List-Id: Commits to the stable branches of the FreeBSD src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-branches
X-Git-Committer: markj
X-Git-Repository: src
X-Git-Refname: refs/heads/stable/13
X-Git-Reftype: branch
X-Git-Commit: 8bebdbe494f6909221e324ec5c13700dfd30cb5e

The branch stable/13 has been updated by markj:

URL: https://cgit.FreeBSD.org/src/commit/?id=8bebdbe494f6909221e324ec5c13700dfd30cb5e

commit 8bebdbe494f6909221e324ec5c13700dfd30cb5e
Author:     Mark Johnston
AuthorDate: 2022-09-24 13:19:21 +0000
Commit:     Mark Johnston
CommitDate: 2022-10-09 15:21:10 +0000

    amd64: Make it possible to grow the KERNBASE region of KVA

    pmap_growkernel() may be called when mapping a region above KERNBASE,
    typically for a kernel module.  If we have enough PTPs left over from
    bootstrap, pmap_growkernel() does nothing.  However, it's possible to
    run out, and in this case pmap_growkernel() will try to grow the
    kernel map all the way from kernel_vm_end to somewhere past KERNBASE,
    which can easily run the system out of memory.  This happens with
    large kernel modules such as the nvidia GPU driver.  There is also a
    WIP dtrace provider which needs to map KVA in the region above
    KERNBASE (to provide trampolines which allow a copy of traced kernel
    instruction to be executed), and its allocations could potentially
    trigger this scenario.

    This change modifies pmap_growkernel() to manage the two regions
    separately, allowing them to grow independently.  The end of the
    KERNBASE region is tracked by modifying "nkpt".

    PR:             265019
    Reviewed by:    alc, imp, kib

    (cherry picked from commit 0b29f5efcc7ee8271ad2f6b6447898b489d618ec)
---
 sys/amd64/amd64/pmap.c | 65 ++++++++++++++++++++++++++++++++++----------------
 1 file changed, 44 insertions(+), 21 deletions(-)

diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c
index 6348f4c7acf0..548b7d66dd2b 100644
--- a/sys/amd64/amd64/pmap.c
+++ b/sys/amd64/amd64/pmap.c
@@ -4880,13 +4880,21 @@ pmap_growkernel(vm_offset_t addr)
 	vm_page_t nkpg;
 	pd_entry_t *pde, newpdir;
 	pdp_entry_t *pdpe;
+	vm_offset_t end;
 
 	mtx_assert(&kernel_map->system_mtx, MA_OWNED);
 
 	/*
-	 * Return if "addr" is within the range of kernel page table pages
-	 * that were preallocated during pmap bootstrap.  Moreover, leave
-	 * "kernel_vm_end" and the kernel page table as they were.
+	 * The kernel map covers two distinct regions of KVA: that used
+	 * for dynamic kernel memory allocations, and the uppermost 2GB
+	 * of the virtual address space.  The latter is used to map the
+	 * kernel and loadable kernel modules.  This scheme enables the
+	 * use of a special code generation model for kernel code which
+	 * takes advantage of compact addressing modes in machine code.
+	 *
+	 * Both regions grow upwards; to avoid wasting memory, the gap
+	 * in between is unmapped.  If "addr" is above "KERNBASE", the
+	 * kernel's region is grown, otherwise the kmem region is grown.
 	 *
 	 * The correctness of this action is based on the following
 	 * argument: vm_map_insert() allocates contiguous ranges of the
@@ -4898,20 +4906,31 @@ pmap_growkernel(vm_offset_t addr)
 	 * any new kernel page table pages between "kernel_vm_end" and
 	 * "KERNBASE".
 	 */
-	if (KERNBASE < addr && addr <= KERNBASE + nkpt * NBPDR)
-		return;
+	if (KERNBASE < addr) {
+		end = KERNBASE + nkpt * NBPDR;
+		if (end == 0)
+			return;
+	} else {
+		end = kernel_vm_end;
+	}
 
 	addr = roundup2(addr, NBPDR);
 	if (addr - 1 >= vm_map_max(kernel_map))
 		addr = vm_map_max(kernel_map);
-	if (kernel_vm_end < addr)
-		kasan_shadow_map(kernel_vm_end, addr - kernel_vm_end);
-	while (kernel_vm_end < addr) {
-		pdpe = pmap_pdpe(kernel_pmap, kernel_vm_end);
+	if (addr <= end) {
+		/*
+		 * The grown region is already mapped, so there is
+		 * nothing to do.
+		 */
+		return;
+	}
+
+	kasan_shadow_map(end, addr - end);
+	while (end < addr) {
+		pdpe = pmap_pdpe(kernel_pmap, end);
 		if ((*pdpe & X86_PG_V) == 0) {
-			/* We need a new PDP entry */
 			nkpg = pmap_alloc_pt_page(kernel_pmap,
-			    kernel_vm_end >> PDPSHIFT, VM_ALLOC_WIRED |
+			    pmap_pdpe_pindex(end), VM_ALLOC_WIRED |
 			    VM_ALLOC_INTERRUPT | VM_ALLOC_ZERO);
 			if (nkpg == NULL)
 				panic("pmap_growkernel: no memory to grow kernel");
@@ -4920,31 +4939,35 @@ pmap_growkernel(vm_offset_t addr)
 			    X86_PG_A | X86_PG_M);
 			continue; /* try again */
 		}
-		pde = pmap_pdpe_to_pde(pdpe, kernel_vm_end);
+		pde = pmap_pdpe_to_pde(pdpe, end);
 		if ((*pde & X86_PG_V) != 0) {
-			kernel_vm_end = (kernel_vm_end + NBPDR) & ~PDRMASK;
-			if (kernel_vm_end - 1 >= vm_map_max(kernel_map)) {
-				kernel_vm_end = vm_map_max(kernel_map);
+			end = (end + NBPDR) & ~PDRMASK;
+			if (end - 1 >= vm_map_max(kernel_map)) {
+				end = vm_map_max(kernel_map);
 				break;
 			}
 			continue;
 		}
 
-		nkpg = pmap_alloc_pt_page(kernel_pmap,
-		    pmap_pde_pindex(kernel_vm_end), VM_ALLOC_WIRED |
-		    VM_ALLOC_INTERRUPT | VM_ALLOC_ZERO);
+		nkpg = pmap_alloc_pt_page(kernel_pmap, pmap_pde_pindex(end),
+		    VM_ALLOC_WIRED | VM_ALLOC_INTERRUPT | VM_ALLOC_ZERO);
 		if (nkpg == NULL)
 			panic("pmap_growkernel: no memory to grow kernel");
 		paddr = VM_PAGE_TO_PHYS(nkpg);
 		newpdir = paddr | X86_PG_V | X86_PG_RW | X86_PG_A | X86_PG_M;
 		pde_store(pde, newpdir);
 
-		kernel_vm_end = (kernel_vm_end + NBPDR) & ~PDRMASK;
-		if (kernel_vm_end - 1 >= vm_map_max(kernel_map)) {
-			kernel_vm_end = vm_map_max(kernel_map);
+		end = (end + NBPDR) & ~PDRMASK;
+		if (end - 1 >= vm_map_max(kernel_map)) {
+			end = vm_map_max(kernel_map);
 			break;
 		}
 	}
+
+	if (end <= KERNBASE)
+		kernel_vm_end = end;
+	else
+		nkpt = howmany(end - KERNBASE, NBPDR);
 }
 
 /***************************************************
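
For readers who want to follow the bookkeeping without the page table details, the region selection and the final write-back in pmap_growkernel() can be modeled in userspace.  This is only a sketch under stated assumptions, not kernel code: "struct kva_model", "model_growkernel()" and the "vm_max" field are hypothetical names introduced for illustration, the constants are redefined locally, and the loop that walks PDEs and allocates page table pages is replaced by a plain assignment.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the amd64 definitions (assumed values). */
#define NBPDR		(UINT64_C(1) << 21)	/* KVA covered by one page table page */
#define KERNBASE	UINT64_C(0xffffffff80000000)	/* uppermost 2GB */
#define howmany(x, y)	(((x) + ((y) - 1)) / (y))
#define roundup2(x, y)	(((x) + ((y) - 1)) & ~((y) - 1))

/* Hypothetical model of the state that pmap_growkernel() updates. */
struct kva_model {
	uint64_t kernel_vm_end;	/* end of the kmem region */
	uint64_t nkpt;		/* page table pages backing the KERNBASE region */
	uint64_t vm_max;	/* stand-in for vm_map_max(kernel_map) */
};

/* Grow one region up to "addr"; return the number of NBPDR steps added. */
static uint64_t
model_growkernel(struct kva_model *m, uint64_t addr)
{
	uint64_t end, old_end;

	assert(m->kernel_vm_end <= KERNBASE);

	/* Pick the region: above KERNBASE its end is derived from nkpt. */
	if (KERNBASE < addr) {
		end = KERNBASE + m->nkpt * NBPDR;
		if (end == 0)	/* wrapped: the whole 2GB is already covered */
			return (0);
	} else {
		end = m->kernel_vm_end;
	}

	addr = roundup2(addr, NBPDR);
	if (addr - 1 >= m->vm_max)
		addr = m->vm_max;
	if (addr <= end)
		return (0);	/* already mapped, nothing to do */

	/* The kernel allocates page table pages here; the model just advances. */
	old_end = end;
	end = addr;

	/* Write the new end back to whichever region was grown. */
	if (end <= KERNBASE)
		m->kernel_vm_end = end;
	else
		m->nkpt = howmany(end - KERNBASE, NBPDR);
	return (howmany(end - old_end, NBPDR));
}
```

With nkpt starting at 16, a request for KERNBASE + 20 * NBPDR + 1 raises nkpt to 21 and leaves kernel_vm_end alone, which is the independence between the two regions that the commit message describes.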