Help with filing a [maybe] ZFS/mmap bug.
George Hartzell
hartzell at alerce.com
Sun Nov 17 20:58:35 UTC 2013
Andriy Gapon writes:
> on 14/11/2013 13:48 Andriy Gapon said the following:
> > HOWEVER. I think that there is a bug that I introduced in r246293.
> > Specifically I changed
> > vm_page_undirty(pp);
> > to
> > pmap_remove_write(pp);
> > vm_page_clear_dirty(pp, off, nbytes);
> >
> > vm_page_undirty() would be a very serious (and probably obvious) bug, if it were
> > not a NOP in effect. The details are explained in the commit message.
> > But when I used vm_page_clear_dirty I completely missed the fact that *extends*
> > the range to DEV_BSIZE aligned boundaries[*]. So, given the described behavior
> > and that pmap_remove_write clears the page modified bit, it is possible that the
> > data dirty data in the extended areas will be marked as clean.
>
> Here is a patch (for head) that should fix the described above issue:
>
> diff --git a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
> b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
> index 2e2cbd6..4fcd571 100644
> --- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
> +++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
> @@ -328,6 +328,20 @@ page_busy(vnode_t *vp, int64_t start, int64_t off, int64_t
> nbytes)
> {
> vm_object_t obj;
> vm_page_t pp;
> + int64_t end;
> +
> + /*
> + * At present vm_page_clear_dirty extends the cleared range to DEV_BSIZE
> + * aligned boundaries, if the range is not aligned. As a result a
> + * DEV_BSIZE subrange with partially dirty data may get marked as clean.
> + * It may happen that all DEV_BSIZE subranges are marked clean and thus
> + * the whole page would be considred clean despite have some dirty data.
> + * For this reason we should shrink the range to DEV_BSIZE aligned
> + * boundaries before calling vm_page_clear_dirty.
> + */
> + end = rounddown2(off + nbytes, DEV_BSIZE);
> + off = roundup2(off, DEV_BSIZE);
> + nbytes = end - off;
>
> obj = vp->v_object;
> zfs_vmobject_assert_wlocked(obj);
> @@ -362,7 +376,8 @@ page_busy(vnode_t *vp, int64_t start, int64_t off, int64_t
> nbytes)
> ASSERT3U(pp->valid, ==, VM_PAGE_BITS_ALL);
> vm_object_pip_add(obj, 1);
> pmap_remove_write(pp);
> - vm_page_clear_dirty(pp, off, nbytes);
> + if (nbytes != 0)
> + vm_page_clear_dirty(pp, off, nbytes);
> }
> break;
> }
>
> --
> Andriy Gapon
This fixes my "test case" (un-automated though it may be). I've
successfully run twice through using Picard (which uses Mutagen, which
uses mmap) on my 10-ALPHA-2 to tag and transcode a set of tracks that
used to consistently but nondeterministically result in failures.
It's a bit of a negative result, all that I can see is that I seem to
no longer see the problem. But I'm happier than I was and your logic
seems to match the reality I was experiencing.
Thanks!
g.
More information about the freebsd-stable
mailing list