Help with filing a [maybe] ZFS/mmap bug.

Andriy Gapon avg at FreeBSD.org
Sun Nov 17 09:49:33 UTC 2013


on 14/11/2013 13:48 Andriy Gapon said the following:
> HOWEVER.  I think that there is a bug that I introduced in r246293.
> Specifically I changed
> 	vm_page_undirty(pp);
> to
> 	pmap_remove_write(pp);
> 	vm_page_clear_dirty(pp, off, nbytes);
> 
> vm_page_undirty() would be a very serious (and probably obvious) bug, if it were
> not a NOP in effect.  The details are explained in the commit message.
> But when I used vm_page_clear_dirty I completely missed the fact that *extends*
> the range to DEV_BSIZE aligned boundaries[*].  So, given the described behavior
> and that pmap_remove_write clears the page modified bit, it is possible that the
> data dirty data in the extended areas will be marked as clean.

Here is a patch (for head) that should fix the described above issue:

diff --git a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
index 2e2cbd6..4fcd571 100644
--- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
+++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
@@ -328,6 +328,20 @@ page_busy(vnode_t *vp, int64_t start, int64_t off, int64_t
nbytes)
 {
 	vm_object_t obj;
 	vm_page_t pp;
+	int64_t end;
+
+	/*
+	 * At present vm_page_clear_dirty extends the cleared range to DEV_BSIZE
+	 * aligned boundaries, if the range is not aligned.  As a result a
+	 * DEV_BSIZE subrange with partially dirty data may get marked as clean.
+	 * It may happen that all DEV_BSIZE subranges are marked clean and thus
+	 * the whole page would be considred clean despite have some dirty data.
+	 * For this reason we should shrink the range to DEV_BSIZE aligned
+	 * boundaries before calling vm_page_clear_dirty.
+	 */
+	end = rounddown2(off + nbytes, DEV_BSIZE);
+	off = roundup2(off, DEV_BSIZE);
+	nbytes = end - off;

 	obj = vp->v_object;
 	zfs_vmobject_assert_wlocked(obj);
@@ -362,7 +376,8 @@ page_busy(vnode_t *vp, int64_t start, int64_t off, int64_t
nbytes)
 			ASSERT3U(pp->valid, ==, VM_PAGE_BITS_ALL);
 			vm_object_pip_add(obj, 1);
 			pmap_remove_write(pp);
-			vm_page_clear_dirty(pp, off, nbytes);
+			if (nbytes != 0)
+				vm_page_clear_dirty(pp, off, nbytes);
 		}
 		break;
 	}

-- 
Andriy Gapon


More information about the freebsd-stable mailing list