Date: Thu, 4 May 2023 15:40:14 GMT
Message-Id: <202305041540.344FeETu059928@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Konstantin Belousov
Subject: git: af1c6d3f3013 - main - amd64: do not leak pcpu pages
List-Id: Commit messages for the main branch of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main
Sender: owner-dev-commits-src-main@freebsd.org
X-Git-Committer: kib
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: af1c6d3f3013062370692c8e1e9c87bb138fbbd9
Auto-Submitted: auto-generated

The branch main has been updated by kib:

URL: https://cgit.FreeBSD.org/src/commit/?id=af1c6d3f3013062370692c8e1e9c87bb138fbbd9

commit af1c6d3f3013062370692c8e1e9c87bb138fbbd9
Author:     Konstantin Belousov
AuthorDate: 2023-05-03 09:41:46 +0000
Commit:     Konstantin Belousov
CommitDate: 2023-05-04 15:39:22 +0000

    amd64: do not leak pcpu pages

    Do not preallocate pcpu area backing pages on early startup, only
    allocate enough of KVA for pcpu[MAXCPU] and the page for BSP.  Other
    pages are allocated after we know the number of cpus and their
    assignments to the domains.

    PCPUs are not accessed until they are initialized, which happens on AP
    startup.

    Reviewed by:	markj
    Sponsored by:	The FreeBSD Foundation
    Differential revision:	https://reviews.freebsd.org/D39945
---
 sys/amd64/amd64/mp_machdep.c | 52 ++++++++++++++++++++------------------------
 sys/amd64/amd64/pmap.c       | 17 ++++++++++-----
 2 files changed, 34 insertions(+), 35 deletions(-)

diff --git a/sys/amd64/amd64/mp_machdep.c b/sys/amd64/amd64/mp_machdep.c
index f6c3446e9981..5fdde0bb887d 100644
--- a/sys/amd64/amd64/mp_machdep.c
+++ b/sys/amd64/amd64/mp_machdep.c
@@ -290,29 +290,32 @@ init_secondary(void)
 	init_secondary_tail();
 }
 
-/*******************************************************************
- * local functions and data
- */
-
-#ifdef NUMA
 static void
-mp_realloc_pcpu(int cpuid, int domain)
+amd64_mp_alloc_pcpu(void)
 {
 	vm_page_t m;
-	vm_offset_t oa, na;
-
-	oa = (vm_offset_t)&__pcpu[cpuid];
-	if (vm_phys_domain(pmap_kextract(oa)) == domain)
-		return;
-	m = vm_page_alloc_noobj_domain(domain, 0);
-	if (m == NULL)
-		return;
-	na = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m));
-	pagecopy((void *)oa, (void *)na);
-	pmap_qenter((vm_offset_t)&__pcpu[cpuid], &m, 1);
-	/* XXX old pcpu page leaked. */
-}
+	int cpu;
+
+	/* Allocate pcpu areas to the correct domain. */
+	for (cpu = 1; cpu < mp_ncpus; cpu++) {
+#ifdef NUMA
+		m = NULL;
+		if (vm_ndomains > 1) {
+			m = vm_page_alloc_noobj_domain(
+			    acpi_pxm_get_cpu_locality(cpu_apic_ids[cpu]), 0);
+		}
+		if (m == NULL)
 #endif
+			m = vm_page_alloc_noobj(0);
+		if (m == NULL)
+			panic("cannot alloc pcpu page for cpu %d", cpu);
+		pmap_qenter((vm_offset_t)&__pcpu[cpu], &m, 1);
+	}
+}
+
+/*******************************************************************
+ * local functions and data
+ */
 
 /*
  * start each AP in our list
@@ -330,6 +333,7 @@ start_all_aps(void)
 	int apic_id, cpu, domain, i;
 	u_char mpbiosreason;
 
+	amd64_mp_alloc_pcpu();
 	mtx_init(&ap_boot_mtx, "ap boot", NULL, MTX_SPIN);
 
 	MPASS(bootMP_size <= PAGE_SIZE);
@@ -403,16 +407,6 @@ start_all_aps(void)
 	outb(CMOS_REG, BIOS_RESET);
 	outb(CMOS_DATA, BIOS_WARM);	/* 'warm-start' */
 
-	/* Relocate pcpu areas to the correct domain. */
-#ifdef NUMA
-	if (vm_ndomains > 1)
-		for (cpu = 1; cpu < mp_ncpus; cpu++) {
-			apic_id = cpu_apic_ids[cpu];
-			domain = acpi_pxm_get_cpu_locality(apic_id);
-			mp_realloc_pcpu(cpu, domain);
-		}
-#endif
-
 	/* start each AP */
 	domain = 0;
 	for (cpu = 1; cpu < mp_ncpus; cpu++) {
diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c
index 55086125fbb9..1009736472dc 100644
--- a/sys/amd64/amd64/pmap.c
+++ b/sys/amd64/amd64/pmap.c
@@ -1902,7 +1902,7 @@ pmap_bootstrap(vm_paddr_t *firstaddr)
 	vm_offset_t va;
 	pt_entry_t *pte, *pcpu_pte;
 	struct region_descriptor r_gdt;
-	uint64_t cr4, pcpu_phys;
+	uint64_t cr4, pcpu0_phys;
 	u_long res;
 	int i;
 
@@ -1917,7 +1917,7 @@ pmap_bootstrap(vm_paddr_t *firstaddr)
 	 */
 	create_pagetables(firstaddr);
 
-	pcpu_phys = allocpages(firstaddr, MAXCPU);
+	pcpu0_phys = allocpages(firstaddr, 1);
 
 	/*
 	 * Add a physical memory segment (vm_phys_seg) corresponding to the
@@ -1995,10 +1995,15 @@ pmap_bootstrap(vm_paddr_t *firstaddr)
 	SYSMAP(struct pcpu *, pcpu_pte, __pcpu, MAXCPU);
 	virtual_avail = va;
 
-	for (i = 0; i < MAXCPU; i++) {
-		pcpu_pte[i] = (pcpu_phys + ptoa(i)) | X86_PG_V | X86_PG_RW |
-		    pg_g | pg_nx | X86_PG_M | X86_PG_A;
-	}
+	/*
+	 * Map the BSP PCPU now, the rest of the PCPUs are mapped by
+	 * amd64_mp_alloc_pcpu()/start_all_aps() when we know the
+	 * number of CPUs and NUMA affinity.
+	 */
+	pcpu_pte[0] = pcpu0_phys | X86_PG_V | X86_PG_RW | pg_g | pg_nx |
+	    X86_PG_M | X86_PG_A;
+	for (i = 1; i < MAXCPU; i++)
+		pcpu_pte[i] = 0;
 
 	/*
 	 * Re-initialize PCPU area for BSP after switching.