RFC: NFS client patch to reduce synchronous writes

Fri Nov 29 06:00:40 UTC 2013

> Date: Thu, 28 Nov 2013 09:18:21 +0200
> From: Konstantin Belousov <kostikbel at gmail.com>
> To: Kirk McKusick <mckusick at mckusick.com>
> Cc: Rick Macklem <rmacklem at uoguelph.ca>, FreeBSD FS <freebsd-fs at freebsd.org>
> Subject: Re: RFC: NFS client patch to reduce sychronous writes
> 
> On Wed, Nov 27, 2013 at 03:20:14PM -0800, Kirk McKusick wrote:
>> The ``fix'' of bzero'ing every buffer cache page was made to UFS/FFS
>> for this problem and it killed write performance of the filesystem
>> by nearly half. We corrected this by only doing the bzero when the
>> file is mmap'ed which helped things considerably (since most files
>> being written are not also mmap'ed).
> 
> I am not sure that I follow.
> 
> For UFS, leaving any part of the buffer with undefined garbage would
> cause the garbage to appear on the next mmap(2), since page in is
> implemented as translation of the file offsets into disk offsets and
> then reading disk blocks. The read always fetches a full page. UFS cannot
> know if the file would be mapped sometime in future, or after the
> reboot.
> 
> In fact, UFS is quite plentiful WRT zeroing buffers on write. It is easy
> to see almost all places where it is done, by searching for BA_CLRBUF
> flag for UFS_BALLOC(). UFS does perform the optimization of _trying_ to
> not clear a newly allocated buffer on write if the uio covers the whole buffer
> range. Still, on error it falls back to clearing, which is performed by
> vfs_bio_clrbuf() call in ffs_write().

You are entirely correct in your analysis. The original "fix" was to always
clear every buffer even when it was being completely filled (which is the
most common case). I changed the completely-filled case to first try the
copyin and to zero the buffer only when the copyin fails. Making that change
nearly doubled the speed of bulk writes.

	~Kirk
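
For readers following the thread, the "copy first, zero only on failure"
idea described above can be sketched in userspace C. This is a rough
illustration under stated assumptions, not the actual ffs_write() code:
BLKSIZE, sim_copyin() and write_block() are hypothetical names invented
for the example, and a NULL source pointer stands in for a faulting user
address.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLKSIZE 4096	/* hypothetical block size for this sketch */

/*
 * Hypothetical stand-in for a kernel copyin: copies len bytes and
 * returns 0, or returns -1 when src is NULL (a simulated fault).
 */
int
sim_copyin(const void *src, void *dst, size_t len)
{
	if (src == NULL)
		return (-1);
	memcpy(dst, src, len);
	return (0);
}

/*
 * Write len bytes at offset off into a freshly allocated block buffer
 * whose previous contents are undefined.
 */
int
write_block(unsigned char *buf, size_t off, const void *src, size_t len)
{
	int error;

	assert(buf != NULL);
	if (off > BLKSIZE || len > BLKSIZE - off)
		return (-1);

	if (off == 0 && len == BLKSIZE) {
		/* Full overwrite: skip the up-front zeroing. */
		error = sim_copyin(src, buf, len);
		if (error != 0) {
			/* Fall back to clearing so no garbage survives. */
			memset(buf, 0, BLKSIZE);
		}
		return (error);
	}

	/* Partial write: clear first so untouched bytes read as zero. */
	memset(buf, 0, BLKSIZE);
	return (sim_copyin(src, buf + off, len));
}
```

In the common case of a full-block write that succeeds, the buffer is
touched once rather than twice, which is where the saving comes from.
The buffer is cleared only on the failure path and for partial writes.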