Help with filing a [maybe] ZFS/mmap bug.
Andriy Gapon
avg at FreeBSD.org
Mon Nov 18 09:39:00 UTC 2013
on 18/11/2013 11:24 Jan Mikkelsen said the following:
>
> On 17 Nov 2013, at 8:48 pm, Andriy Gapon <avg at freebsd.org> wrote:
>
>> on 14/11/2013 13:48 Andriy Gapon said the following:
>>> HOWEVER. I think that there is a bug that I introduced in r246293.
>>> Specifically I changed
>>> vm_page_undirty(pp);
>>> to
>>> pmap_remove_write(pp);
>>> vm_page_clear_dirty(pp, off, nbytes);
>>>
>>> vm_page_undirty() would be a very serious (and probably obvious) bug, if it were
>>> not a NOP in effect. The details are explained in the commit message.
>>> But when I used vm_page_clear_dirty I completely missed the fact that *extends*
>>> the range to DEV_BSIZE aligned boundaries[*]. So, given the described behavior
>>> and that pmap_remove_write clears the page modified bit, it is possible that the
>>> data dirty data in the extended areas will be marked as clean.
>>
>> Here is a patch (for head) that should fix the described above issue:
>>
>> diff --git a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
>> b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
>> index 2e2cbd6..4fcd571 100644
>> --- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
>> +++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
>> @@ -328,6 +328,20 @@ page_busy(vnode_t *vp, int64_t start, int64_t off, int64_t
>> nbytes)
>> {
>> vm_object_t obj;
>> vm_page_t pp;
>> + int64_t end;
>> +
>> + /*
>> + * At present vm_page_clear_dirty extends the cleared range to DEV_BSIZE
>> + * aligned boundaries, if the range is not aligned. As a result a
>> + * DEV_BSIZE subrange with partially dirty data may get marked as clean.
>> + * It may happen that all DEV_BSIZE subranges are marked clean and thus
>> + * the whole page would be considred clean despite have some dirty data.
>> + * For this reason we should shrink the range to DEV_BSIZE aligned
>> + * boundaries before calling vm_page_clear_dirty.
>> + */
>> + end = rounddown2(off + nbytes, DEV_BSIZE);
>> + off = roundup2(off, DEV_BSIZE);
>> + nbytes = end - off;
>>
>> obj = vp->v_object;
>> zfs_vmobject_assert_wlocked(obj);
>> @@ -362,7 +376,8 @@ page_busy(vnode_t *vp, int64_t start, int64_t off, int64_t
>> nbytes)
>> ASSERT3U(pp->valid, ==, VM_PAGE_BITS_ALL);
>> vm_object_pip_add(obj, 1);
>> pmap_remove_write(pp);
>> - vm_page_clear_dirty(pp, off, nbytes);
>> + if (nbytes != 0)
>> + vm_page_clear_dirty(pp, off, nbytes);
>> }
>> break;
>> }
>
>
> 9.2 does not seem to have a rounddown2() macro.
Thanks for the heads-up!
You could use a plain rounddown() or just 'x & ~(DEV_BSIZE - 1)'.
--
Andriy Gapon
More information about the freebsd-stable
mailing list