cleaning files beyond EOF

Bruce Evans brde at optusnet.com.au
Sun Feb 17 07:02:04 UTC 2013


On Sun, 17 Feb 2013, Konstantin Belousov wrote:

> On Sun, Feb 17, 2013 at 11:33:58AM +1100, Bruce Evans wrote:
>> I have a (possibly damaged) ffs data block with nonzero data beyond
>> EOF.  Is anything responsible for clearing this data when the file
>> is mmapped()?
>>
>> At least old versions of gcc mmap() the file and have a bug checking
>> for EOF.  They read the garbage beyond the end and get confused.
>
> Does the 'damaged' status of the data block mean that it contains
> garbage after EOF on disk?

Yes, it's at most software damage.  I used a broken version of
vfs_bio_clrbuf() for a long time and it probably left some unusual
blocks.  This matters surprisingly rarely.

I forgot to mention that this is with an old version of FreeBSD,
where I changed vfs_bio.c a lot but barely touched vm.

> UFS uses a small wrapper around vnode_generic_getpages() as the
> VOP_GETPAGES(), the wrapping code can be ignored for the current
> purpose.
>
> vnode_generic_getpages() iterates over the pages after the bstrategy()
> and marks the part of the page after EOF valid and zeroes it, using
> vm_page_set_valid_range().

The old version has a large non-wrapper in ffs, and
vnode_pager_generic_getpages() uses vm_page_set_validclean().  Maybe the
bug is just in the old ffs_getpages().  It seems to do zeroing only in
DEV_BSIZE units.  It
begins with the same "We have to zero that data" code that forms most
of the wrapper in the current version.  It normally only returns
vnode_pager_generic_getpages() after that if bsize < PAGE_SIZE.
However, my version has a variable to control this, which I had
forgotten about, and the forgotten setting of this variable results in
always using vnode_pager_generic_getpages(), as in -current.  I probably
copied some fixes in -current for this.  So the bug can't be just in
ffs_getpages().
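
To make the granularity point concrete, here is a toy model (not the
kernel code) of why zeroing only in DEV_BSIZE units could leave garbage
between EOF and the next sector boundary.  The function names and the
0xaa fill pattern are invented for illustration, and PAGE_SIZE = 4096 is
an assumption.  Whether the old ffs_getpages() really behaves like the
second function is not established here.

```python
DEV_BSIZE = 512    # sector size used for the coarse zeroing
PAGE_SIZE = 4096   # assumption: 4K pages

def zero_tail_exact(page, valid):
    """Zero every byte of the page from offset `valid` (EOF) onward."""
    page[valid:] = bytes(len(page) - valid)

def zero_tail_sector_granular(page, valid, sector=DEV_BSIZE):
    """Zero only the whole sectors that lie entirely beyond `valid`."""
    start = -(-valid // sector) * sector   # round `valid` up to a sector
    page[start:] = bytes(len(page) - start)

def garbage_beyond_eof(page, valid):
    """Count nonzero bytes at or after offset `valid`."""
    return sum(1 for b in page[valid:] if b != 0)

# Last page of a file whose size is 16 bytes past a page boundary,
# filled with a recognizable garbage pattern.
exact = bytearray(b"\xaa" * PAGE_SIZE)
coarse = bytearray(b"\xaa" * PAGE_SIZE)
zero_tail_exact(exact, 16)
zero_tail_sector_granular(coarse, 16)
print(garbage_beyond_eof(exact, 16), garbage_beyond_eof(coarse, 16))
```

In this model the byte-exact version leaves nothing beyond EOF, while
the sector-granular one leaves the rest of the first sector untouched.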

The "damaged" block is at the end of vfs_default.c.  The file size is
25 * PAGE_SIZE + 16.  It is in 7 16K blocks, 2 full 2K frags, and 1 frag
with 16 bytes valid in it.
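
The layout arithmetic can be checked with a short sketch.  It assumes
PAGE_SIZE = 4096 (not stated above) and uses the 16K block and 2K frag
sizes from the description; the helper names are made up for the
example.

```python
PAGE_SIZE = 4096     # assumption: 4K pages
BLOCK_SIZE = 16384   # 16K blocks, as described
FRAG_SIZE = 2048     # 2K frags, as described

def layout(file_size, block_size=BLOCK_SIZE, frag_size=FRAG_SIZE):
    """Return (full blocks, full frags in the tail, bytes in the last frag)."""
    full_blocks, tail = divmod(file_size, block_size)
    full_frags, last_frag_bytes = divmod(tail, frag_size)
    return full_blocks, full_frags, last_frag_bytes

def slack(file_size, unit):
    """Bytes between EOF and the end of the `unit` that contains EOF."""
    return (-file_size) % unit

size = 25 * PAGE_SIZE + 16
print("file size:", size)
print("full blocks, full frags, bytes in last frag:", layout(size))
print("bytes beyond EOF in the last frag:", slack(size, FRAG_SIZE))
print("bytes beyond EOF in the last page:", slack(size, PAGE_SIZE))
```

Under these assumptions the file is 102416 bytes: 6 full blocks, with
the 2 full frags and the 16-byte frag making up a partial 7th block.
That leaves 2032 bytes beyond EOF in the last frag and 4080 in the last
page, which is where stale data could be visible through mmap().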

I have another problem that is apparently with
vnode_pager_generic_getpages() and affects -current from about a year
ago in the same way as the old version: mmap() is very slow
in msdosfs.  cmp uses mmap() too much, and reading files sequentially
using mmap() is 3.4 times slower than reading them using read() on my
DVD media/drive.  The i/o seems to be correctly clustered for both,
with average transaction sizes over 50K, but tps is much lower for mmap().
It is similar on a (faster) hard disk, except the slowness is not as noticeable
(drive buffering might hide it completely).  However, for ffs files on
the hard disk, mmap() is as fast as read().
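
A rough way to reproduce this kind of comparison from userland is
sketched below.  It is not the method used for the numbers above; the
64K chunk size and the helper names are assumptions.  The built-in demo
only uses a temporary file, which will normally be cached, so for a
meaningful measurement the two functions should be pointed at a large
file on the filesystem under test with a cold cache.

```python
import mmap
import os
import tempfile
import time

CHUNK = 64 * 1024  # assumed chunk size; not taken from the post

def read_with_read(path, chunk=CHUNK):
    """Read the file sequentially with read(); return bytes seen."""
    total = 0
    with open(path, "rb") as f:
        while True:
            buf = f.read(chunk)
            if not buf:
                break
            total += len(buf)
    return total

def read_with_mmap(path, chunk=CHUNK):
    """Map the file and copy it out sequentially; return bytes seen."""
    total = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return 0  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            for off in range(0, size, chunk):
                total += len(m[off:off + chunk])  # slicing touches the pages
    return total

def timed(fn, path):
    """Return (bytes read, elapsed seconds) for one pass of fn over path."""
    start = time.perf_counter()
    nbytes = fn(path)
    return nbytes, time.perf_counter() - start

if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(4 * 1024 * 1024))
        demo_path = tmp.name
    try:
        for fn in (read_with_read, read_with_mmap):
            nbytes, secs = timed(fn, demo_path)
            print(f"{fn.__name__}: {nbytes} bytes in {secs:.4f}s")
    finally:
        os.unlink(demo_path)
```

Watching tps and transaction sizes with iostat while each pass runs
gives the same kind of data as quoted above.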

Bruce


More information about the freebsd-fs mailing list