RFC: NFS client patch to reduce synchronous writes

Rick Macklem rmacklem at uoguelph.ca
Tue Nov 26 23:41:15 UTC 2013


Hi,

The current NFS client does a synchronous write
to the server when a non-contiguous write to the
same buffer cache block occurs. This is done because
there is a single dirty byte range recorded in the
buf structure. This results in a lot of synchronous
writes for software builds (I believe it is the loader
that loves to write small non-contiguous chunks to
its output file). Some users disable synchronous
writing on the server to improve performance, but
this puts them at risk of data loss when the server
crashes.
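
To illustrate the behaviour described above, here is a minimal
user-space sketch, not the kernel code. The struct and function
names are hypothetical stand-ins for the single dirty byte range
kept in the buf structure; the "flush" is only modelled by
resetting the range.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the single dirty byte range of one block. */
struct dirty_range {
    int off;    /* first dirty byte within the block */
    int end;    /* one past the last dirty byte; off == end means clean */
};

/*
 * True if a write of [off, end) overlaps or touches the recorded dirty
 * range, so one range can still describe exactly the bytes written.
 */
static bool
write_is_contiguous(const struct dirty_range *dr, int off, int end)
{
    assert(off >= 0 && off < end);
    if (dr->off == dr->end)     /* nothing dirty yet */
        return (true);
    return (off <= dr->end && end >= dr->off);
}

/*
 * Record a write the old/current way. Returns true if the existing
 * dirty range had to be pushed to the server synchronously first
 * (modelled here by simply clearing the range).
 */
static bool
record_write_sync(struct dirty_range *dr, int off, int end)
{
    bool flushed = false;

    if (!write_is_contiguous(dr, off, end)) {
        dr->off = dr->end = 0;  /* stands in for the synchronous write */
        flushed = true;
    }
    if (dr->off == dr->end) {
        dr->off = off;
        dr->end = end;
    } else {
        if (off < dr->off)
            dr->off = off;
        if (end > dr->end)
            dr->end = end;
    }
    return (flushed);
}
```

With this model, writes of [0,100) and then [100,200) merge into
one range, while a following write of [1000,1100) to the same block
forces the synchronous flush.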

Long ago jhb@ emailed me a small patch that avoided
the synchronous writes by simply making the dirty byte
range a superset of the bytes written. The problem
with doing this shows up for a rare (possibly non-existent)
application that writes non-overlapping byte ranges
to the same file from multiple clients concurrently.
The superset range includes bytes this client never
wrote, so when it is written back to the server, stale
data in those bytes can overwrite another client's
write and that write is lost.
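
The lost-write case can be shown with a small user-space
simulation. This is only a sketch under stated assumptions: the
"server" is a plain byte array, client B's write is applied to it
directly, and all names (cached_block, client_write_superset,
client_flush, demo_lost_write) are hypothetical, not taken from
the patch.

```c
#include <assert.h>
#include <string.h>

#define BLKSIZE 128

/* Hypothetical client-side cache of one block with one dirty range. */
struct cached_block {
    char data[BLKSIZE];
    int dirtyoff;       /* first dirty byte */
    int dirtyend;       /* one past last dirty byte; equal means clean */
};

/* Copy data into the cache and grow the dirty range to a superset. */
static void
client_write_superset(struct cached_block *cb, int off, const char *src,
    int len)
{
    memcpy(cb->data + off, src, (size_t)len);
    if (cb->dirtyoff == cb->dirtyend) {
        cb->dirtyoff = off;
        cb->dirtyend = off + len;
    } else {
        if (off < cb->dirtyoff)
            cb->dirtyoff = off;
        if (off + len > cb->dirtyend)
            cb->dirtyend = off + len;
    }
}

/* Write the whole dirty range back to the simulated server. */
static void
client_flush(struct cached_block *cb, char *server)
{
    memcpy(server + cb->dirtyoff, cb->data + cb->dirtyoff,
        (size_t)(cb->dirtyend - cb->dirtyoff));
    cb->dirtyoff = cb->dirtyend = 0;
}

/* Returns 1 if client B's write was lost, 0 if it survived. */
static int
demo_lost_write(void)
{
    char server[BLKSIZE];
    struct cached_block a;

    memset(server, '.', BLKSIZE);
    memcpy(a.data, server, BLKSIZE);    /* client A caches the block */
    a.dirtyoff = a.dirtyend = 0;

    client_write_superset(&a, 0, "AAAA", 4);    /* dirty: [0,4) */
    client_write_superset(&a, 100, "AAAA", 4);  /* dirty: [0,104) */
    memcpy(server + 50, "BBBB", 4);     /* client B writes [50,54) */
    client_flush(&a, server);           /* A writes back stale [4,100) */

    return (memcmp(server + 50, "BBBB", 4) != 0);
}
```

Client A never wrote bytes 50-53, but they fall inside its superset
range, so its stale copy replaces what client B wrote.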

I created a patch that maintained a list of dirty byte
ranges. It was complicated and I found that the list
often had to be > 100 entries to avoid the synchronous
writes.

So, I think jhb@'s solution is preferable, although I've
added a couple of tweaks:
- Synchronous writes (the old/current algorithm) are still
  used if file locking has been done on the file.
  (I think any app. that writes a file from multiple clients
   will/should use file locking.)
- Synchronous writes (the old/current algorithm) are also used
  if a sysctl is set. This will avoid breakage for any app.
  (if there is one) that writes a file from multiple clients
  without doing file locking.
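
The two tweaks above amount to a small decision rule, sketched
here in user-space C. The variable and field names
(noncontig_sync_sysctl, locking_done, use_sync_write) are
hypothetical; this message does not give the identifiers or the
sysctl name used in the actual patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the sysctl; non-zero forces the old algorithm. */
static int noncontig_sync_sysctl = 0;

/* Hypothetical per-file state: has any file locking been done on it? */
struct file_state {
    bool locking_done;
};

/*
 * Decide how to handle a write to a block that already has a dirty
 * range. Returns true for the old/current behaviour (synchronous
 * write of the existing range first), false to just grow the dirty
 * range to a superset.
 */
static bool
use_sync_write(const struct file_state *fs, bool contiguous)
{
    if (contiguous)
        return (false);     /* one range still suffices */
    if (fs->locking_done)
        return (true);      /* file may be shared between clients */
    if (noncontig_sync_sysctl != 0)
        return (true);      /* admin asked for the old behaviour */
    return (false);
}
```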

For testing on my very slow single core hardware, I see about
a 10% improvement in kernel build times, along with fewer I/O
RPCs:
             Read RPCs  Write RPCs
old/current     50K        122K
patched         39K         40K
--> it reduced the Read RPC count by about 20% and cut the
    Write RPC count to about 1/3rd.
I think jhb@ saw pretty good performance results with his patch.

Anyhow, the patch is attached and can also be found here:
  http://people.freebsd.org/~rmacklem/noncontig-write.patch

I'd like to invite folks to comment/review/test this patch,
since I think it is ready for head/current.

Thanks, rick
ps: Kostik, maybe you could look at it. In particular, I am
    wondering if I zeroed out the buffer the correct way, via
    vfs_bio_bzero_buf()?
-------------- next part --------------
A non-text attachment was scrubbed...
Name: noncontig-write.patch
Type: text/x-patch
Size: 4671 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20131126/61e26f37/attachment-0001.bin>

