git: c98367641991 - main - vm_fault: Defer marking COW pages valid
Date: Sun, 13 Apr 2025 16:12:35 UTC
The branch main has been updated by markj:
URL: https://cgit.FreeBSD.org/src/commit/?id=c98367641991019bac0e8cd55b70682171820534
commit c98367641991019bac0e8cd55b70682171820534
Author: Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2025-04-13 16:09:31 +0000
Commit: Mark Johnston <markj@FreeBSD.org>
CommitDate: 2025-04-13 16:09:31 +0000
vm_fault: Defer marking COW pages valid
Suppose an object O has two shadow objects S1 and S2, mapped into
processes P1 and P2, respectively. Suppose a page resident in O is
mapped read-only into P1. Now
suppose that P1 writes to the page, triggering a COW fault: it allocates
a new page in S1 and copies the page, then marks it valid. If the page
in O was busy when initially looked up, P1 would have to release the map
lock and sleep first. Then, after handling COW, P1 must re-check the
map lookup because locks were dropped. Suppose the map indeed changed,
so P1 has to retry the fault.
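
To make the setup concrete, here is a minimal userspace sketch of how
such a shadow chain can arise. This is a hypothetical illustration:
the kernel-side objects O, S1, and S2 are not visible to the program,
and the busy-page sleep and subsequent map change are timing
conditions a program cannot force, so this shows the structure rather
than reproducing the race:

#include <sys/types.h>
#include <sys/mman.h>
#include <sys/wait.h>

#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

int
main(void)
{
	char *p;
	pid_t pid;

	/* An anonymous MAP_PRIVATE mapping is backed by a VM object, O. */
	p = mmap(NULL, getpagesize(), PROT_READ | PROT_WRITE,
	    MAP_ANON | MAP_PRIVATE, -1, 0);
	if (p == MAP_FAILED)
		exit(1);
	p[0] = 1;	/* Make a page resident in O. */

	/*
	 * After fork(), parent and child map O's page read-only and
	 * copy-on-write; call the parent P1 and the child P2.  Each
	 * process's first write fault creates its own shadow object
	 * (S1 and S2, respectively) on top of O.
	 */
	pid = fork();
	if (pid == -1)
		exit(1);
	if (pid == 0) {
		p[0] = 2;	/* P2's COW fault populates S2. */
		pause();	/* Keep S2 alive until the next step. */
		_exit(0);
	}
	p[0] = 3;	/* P1's COW fault copies O's page into S1. */
	kill(pid, SIGKILL);	/* P2's exit is the next step, below. */
	waitpid(pid, NULL, 0);
	return (0);
}
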
At this point, the mapped page in O is shadowed by a valid page in S1.
If P2 exits, S2 will be deallocated, resulting in a collapse of O into
S1. In this case, because the mapped page is shadowed, the collapse
code running in P2's context will free it, but that is illegal; this
triggers a "freeing mapped page" assertion in INVARIANTS kernels.
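
In terms of the sketch above, P2's exit is what tears down S2 and
triggers the collapse of O into S1. The assertion in question is the
mapped-page check in vm_page_free_prep(); paraphrased from
sys/vm/vm_page.c, so the exact wording may differ by revision:

	KASSERT(!pmap_page_is_mapped(m),
	    ("vm_page_free_prep: freeing mapped page %p", m));
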
Fix the problem by deferring the vm_page_valid() call which marks the
COW copy valid: only mark it once we know that the fault handler will
succeed. It's okay to leave an invalid page in the top-level object; it
will be freed when the fault is retried, and vm_object_collapse_scan()
will similarly free invalid pages in the shadow object.
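
The fix follows a general "publish last" pattern: do all fallible work
on a private copy, and make it visible (here, mark it valid) only after
the last point of failure. A minimal standalone sketch of the pattern,
with hypothetical names and no relation to the actual kernel code:

#include <stdbool.h>
#include <string.h>

/* A page whose "valid" flag gates its visibility to lookups. */
struct page {
	char	data[64];
	bool	valid;
};

/* Stand-in for the fallible tail of the operation, e.g. a forced retry. */
bool
publish_mapping(struct page *pg)
{
	(void)pg;
	return (true);
}

/*
 * Copy src into dst, but mark dst valid only once nothing can fail.
 * On failure, dst is left invalid, so a later sweep (the analogue of
 * vm_object_collapse_scan() freeing invalid shadow pages) can discard
 * it safely.
 */
bool
cow_copy(struct page *dst, const struct page *src)
{
	memcpy(dst->data, src->data, sizeof(dst->data));
	if (!publish_mapping(dst))
		return (false);
	dst->valid = true;
	return (true);
}
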
Reviewed by: kib
MFC after: 1 month
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D49758
---
sys/vm/vm_fault.c | 19 ++++++++++++++++---
1 file changed, 16 insertions(+), 3 deletions(-)
diff --git a/sys/vm/vm_fault.c b/sys/vm/vm_fault.c
index 81631b672040..0bd3a8207c4a 100644
--- a/sys/vm/vm_fault.c
+++ b/sys/vm/vm_fault.c
@@ -1064,14 +1064,14 @@ vm_fault_cow(struct faultstate *fs)
 		 * Oh, well, lets copy it.
 		 */
 		pmap_copy_page(fs->m, fs->first_m);
-		vm_page_valid(fs->first_m);
 		if (fs->wired && (fs->fault_flags & VM_FAULT_WIRE) == 0) {
 			vm_page_wire(fs->first_m);
 			vm_page_unwire(fs->m, PQ_INACTIVE);
 		}
 		/*
-		 * Save the cow page to be released after
-		 * pmap_enter is complete.
+		 * Save the COW page to be released after pmap_enter is
+		 * complete.  The new copy will be marked valid when we're ready
+		 * to map it.
 		 */
 		fs->m_cow = fs->m;
 		fs->m = NULL;
@@ -1759,6 +1759,19 @@ found:
 	if (hardfault)
 		fs.entry->next_read = vaddr + ptoa(ahead) + PAGE_SIZE;
 
+	/*
+	 * If the page to be mapped was copied from a backing object, we defer
+	 * marking it valid until here, where the fault handler is guaranteed to
+	 * succeed.  Otherwise we can end up with a shadowed, mapped page in the
+	 * backing object, which violates an invariant of vm_object_collapse()
+	 * that shadowed pages are not mapped.
+	 */
+	if (fs.m_cow != NULL) {
+		KASSERT(vm_page_none_valid(fs.m),
+		    ("vm_fault: page %p is already valid", fs.m_cow));
+		vm_page_valid(fs.m);
+	}
+
 	/*
 	 * Page must be completely valid or it is not fit to
 	 * map into user space.  vm_pager_get_pages() ensures this.