Date: Sun, 13 Apr 2025 16:12:35 GMT
Message-Id: <202504131612.53DGCZEl083052@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Mark Johnston
Subject: git: c98367641991 - main - vm_fault: Defer marking COW pages valid
List-Id: Commit messages for the main branch of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main
Sender: owner-dev-commits-src-main@FreeBSD.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: markj
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: c98367641991019bac0e8cd55b70682171820534
Auto-Submitted: auto-generated

The branch main has been updated by markj:

URL: https://cgit.FreeBSD.org/src/commit/?id=c98367641991019bac0e8cd55b70682171820534

commit c98367641991019bac0e8cd55b70682171820534
Author:     Mark Johnston
AuthorDate: 2025-04-13 16:09:31 +0000
Commit:     Mark Johnston
CommitDate: 2025-04-13 16:09:31 +0000

    vm_fault: Defer marking COW pages valid

    Suppose an object O has two shadow objects S1, S2 mapped into processes
    P1, P2. Suppose a page resident in O is mapped read-only into P1. Now
    suppose that P1 writes to the page, triggering a COW fault: it allocates
    a new page in S1 and copies the page, then marks it valid. If the page
    in O was busy when initially looked up, P1 would have to release the map
    lock and sleep first. Then, after handling COW, P1 must re-check the
    map lookup because locks were dropped. Suppose the map indeed changed,
    so P1 has to retry the fault.

    At this point, the mapped page in O is shadowed by a valid page in S1.
    If P2 exits, S2 will be deallocated, resulting in a collapse of O into
    S1. In this case, because the mapped page is shadowed, P2 will free it,
    but that is illegal; this triggers a "freeing mapped page" assertion in
    invariants kernels.

    Fix the problem by deferring the vm_page_valid() call which marks the
    COW copy valid: only mark it once we know that the fault handler will
    succeed. It's okay to leave an invalid page in the top-level object; it
    will be freed when the fault is retried, and vm_object_collapse_scan()
    will similarly free invalid pages in the shadow object.

    Reviewed by:    kib
    MFC after:      1 month
    Sponsored by:   Innovate UK
    Differential Revision:  https://reviews.freebsd.org/D49758
---
 sys/vm/vm_fault.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/sys/vm/vm_fault.c b/sys/vm/vm_fault.c
index 81631b672040..0bd3a8207c4a 100644
--- a/sys/vm/vm_fault.c
+++ b/sys/vm/vm_fault.c
@@ -1064,14 +1064,14 @@ vm_fault_cow(struct faultstate *fs)
 		 * Oh, well, lets copy it.
 		 */
 		pmap_copy_page(fs->m, fs->first_m);
-		vm_page_valid(fs->first_m);
 		if (fs->wired && (fs->fault_flags & VM_FAULT_WIRE) == 0) {
 			vm_page_wire(fs->first_m);
 			vm_page_unwire(fs->m, PQ_INACTIVE);
 		}
 		/*
-		 * Save the cow page to be released after
-		 * pmap_enter is complete.
+		 * Save the COW page to be released after pmap_enter is
+		 * complete. The new copy will be marked valid when we're ready
+		 * to map it.
 		 */
 		fs->m_cow = fs->m;
 		fs->m = NULL;
@@ -1759,6 +1759,19 @@ found:
 	if (hardfault)
 		fs.entry->next_read = vaddr + ptoa(ahead) + PAGE_SIZE;
 
+	/*
+	 * If the page to be mapped was copied from a backing object, we defer
+	 * marking it valid until here, where the fault handler is guaranteed to
+	 * succeed. Otherwise we can end up with a shadowed, mapped page in the
+	 * backing object, which violates an invariant of vm_object_collapse()
+	 * that shadowed pages are not mapped.
+	 */
+	if (fs.m_cow != NULL) {
+		KASSERT(vm_page_none_valid(fs.m),
+		    ("vm_fault: page %p is already valid", fs.m_cow));
+		vm_page_valid(fs.m);
+	}
+
 	/*
 	 * Page must be completely valid or it is not fit to
 	 * map into user space. vm_pager_get_pages() ensures this.
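
The race described in the commit message can be illustrated with a toy user-space model. This is only a sketch under stated assumptions, not kernel code: `struct toy_page`, `toy_cow_copy()`, `toy_fault_finish()` and `toy_collapse_one()` are hypothetical names, and the model tracks a single page index with three flags instead of real `vm_page`/`vm_object` state. It assumes, as the message says, that the collapse scan frees invalid pages in the shadow object and frees a backing page only when a valid page shadows it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one page index in an object. */
struct toy_page {
	bool resident;	/* a page exists at this index */
	bool valid;	/* contents are usable, so it shadows the backing page */
	bool mapped;	/* some pmap still references the page */
};

/*
 * COW step of the fault: allocate the copy in the shadow object.
 * mark_valid_early == true models the old behaviour (valid right after
 * the copy); false models the deferred behaviour from this commit.
 */
static inline void
toy_cow_copy(struct toy_page *shadow, bool mark_valid_early)
{
	shadow->resident = true;
	shadow->valid = mark_valid_early;
}

/*
 * Successful end of the fault: only here is the copy marked valid and
 * mapped. The model assumes entering the new mapping replaces the old
 * read-only mapping of the backing page.
 */
static inline void
toy_fault_finish(struct toy_page *shadow, struct toy_page *backing)
{
	shadow->valid = true;
	shadow->mapped = true;
	backing->mapped = false;
}

/*
 * Collapse of the backing object into the shadow, for one page index.
 * Returns false where an invariants kernel would hit the
 * "freeing mapped page" assertion.
 */
static inline bool
toy_collapse_one(struct toy_page *shadow, struct toy_page *backing)
{
	/* Invalid pages in the shadow object are simply freed. */
	if (shadow->resident && !shadow->valid)
		shadow->resident = false;

	if (shadow->resident) {
		/* The backing page is shadowed by a valid page: free it. */
		if (backing->mapped)
			return (false);
		backing->resident = false;
	}
	/* Otherwise the backing page is kept and moves to the shadow. */
	return (true);
}
```

In this model, calling `toy_cow_copy(&shadow, true)` and then `toy_collapse_one()` before the fault has finished reports the violation, because the backing page is still mapped while a valid copy shadows it. With `toy_cow_copy(&shadow, false)`, the same interleaving just discards the invalid copy and leaves the backing page in place.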