Improving NFS write performance by a factor of 2.
rwatson at FreeBSD.org
Mon Nov 19 04:41:05 PST 2007
On Sun, 18 Nov 2007, Kip Macy wrote:
> Could you do me a favor and submit this in the form of a PR and assign it to
> me? I'm not the most appropriate person for this but the main NFS developer
> is no longer working on FreeBSD and I don't want to see this dropped.
If you're thinking of Mohan, he mostly worked on the client, not the
server. Jeff Roberson would probably be the best person to assign this to, as
he's worked most recently in the NFS server (pushing Giant off the VFS paths
and cleaning up Giant-related locking, whereas I had pushed it down to VFS
before VFS locking was done).
Robert N M Watson
University of Cambridge
> On Nov 18, 2007 12:11 PM, Bjorn Gronvall <bg at sics.se> wrote:
>> I'm not sure if people care about NFS write performance any longer but
>> if you do, please read on.
>> A problem with the current NFS server is that it does not cluster
>> writes; this in turn leads to really poor sequential-write
>> performance.
>> By enabling write clustering, NFS write performance goes from
>> 26.6 Mbyte/s to 54.3 Mbyte/s, an increase by a factor of 2. This is on a
>> SATA disk with write caching enabled (hw.ata.wc=1).
>> If write caching is disabled, performance still goes up from 1.6 Mbyte/s
>> to 5.8 Mbyte/s (a factor of 3.6).
>> The attached patch (relative to current) makes the following changes:
>> 1/ Rearrange the code so that the same code can be used to detect both
>> sequential read and write access.
>> 2/ Merge in updates from vfs_vnops.c::sequential_heuristic.
>> 3/ Use double hashing in order to avoid hash-clustering in the nfsheur
>> table. This change also makes it possible to reduce "try" from 32
>> to 8.
>> 4/ Pack the nfsheur table more efficiently.
>> 5/ Tolerate a small amount of RPC reordering (initially suggested
>> by Ellard and Seltzer).
>> 6/ Back off from sequential access rather than immediately switching to
>> random access (Ellard and Seltzer).
>> 7/ To avoid starvation of the buffer pool, call bwillwrite. The call is
>> issued after the VOP_WRITE in order to avoid additional reordering
>> of write operations.
>> 8/ Add sysctl variables vfs.nfsrv.cluster_writes and cluster_reads to
>> enable or disable clustering. vfs.nfsrv.reordered_io counts the
>> number of reordered RPCs.
>> 9/ In nfsrv_commit, check for write errors and report them back to the
>> client. Also check whether the RPC argument count is zero, which means
>> that we must flush to the end of file according to the RFC.
>> 10/ Two earlier commits broke the write gathering support:
>> This change removed NQNFS stuff but left the NQNFS variable
>> notstarted. This resulted in NFS write gathering effectively
>> being permanently disabled (regardless of whether NFSv2 or NFSv3 is used).
>> This change disabled write gathering (again) for NFSv3 although
>> this should be controlled by vfs.nfs.nfsrvw_procrastinate_v3 !=
>> Write gathering may still be useful with NFSv3 to put reordered write
>> RPCs into order, perhaps also for other reasons. This is now possible
>> The attached patch is for current but you will observe similar
>> improvements with earlier FreeBSD versions. If you would like to have
>> the same patch but for FreeBSD 5.x, 6.x or 7.0 please drop me a line.
>> Bjorn Gronvall (Björn Grönvall)
>> Swedish Institute of Computer Science
>> PO Box 1263, S-164 29 Kista, Sweden
>> Email: bg at sics.se, Phone +46 -8 633 15 25
>> Cellular +46 -70 768 06 35, Fax +46 -8 751 72 30
>> freebsd-current at freebsd.org mailing list
>> To unsubscribe, send any mail to "freebsd-current-unsubscribe at freebsd.org"
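Item 3 of the quoted patch description mentions double hashing for the nfsheur table with a probe limit of 8. The patch itself is not reproduced here, so the following is only a minimal sketch of the general technique, not the patch's code. The table size, both hash functions, the entry layout, and all names (heur_entry, heur_lookup, and so on) are assumptions made for illustration. The point it shows is that two keys colliding on the first slot follow different probe sequences, which is what limits clustering.

```c
#include <assert.h>
#include <stdint.h>

#define HEUR_SIZE  64u  /* hypothetical table size; must be a power of two */
#define HEUR_TRIES 8u   /* probe limit, matching the reduced "try" in the text */

struct heur_entry {
	uintptr_t key;  /* 0 means the slot has never been used */
	uint32_t  use;  /* hypothetical usage counter for victim selection */
};

struct heur_entry heur_table[HEUR_SIZE];

/* Primary hash: selects the first slot to probe. */
unsigned heur_hash1(uintptr_t key)
{
	return (unsigned)((key >> 4) & (HEUR_SIZE - 1));
}

/* Secondary hash: an odd step, so it is coprime with the table size. */
unsigned heur_hash2(uintptr_t key)
{
	return (unsigned)(((key >> 10) & (HEUR_SIZE - 1)) | 1u);
}

/* Find the entry for key, or recycle the least-used slot that was probed. */
struct heur_entry *heur_lookup(uintptr_t key)
{
	assert(key != 0);

	unsigned slot = heur_hash1(key);
	unsigned step = heur_hash2(key);
	struct heur_entry *victim = &heur_table[slot];

	for (unsigned i = 0; i < HEUR_TRIES; i++) {
		struct heur_entry *e = &heur_table[slot];

		if (e->key == key) {
			e->use++;
			return e;
		}
		if (e->use < victim->use)
			victim = e;
		/* Double hashing: the stride depends on the key itself. */
		slot = (slot + step) & (HEUR_SIZE - 1);
	}

	/* Not found within the probe limit: take over the victim slot. */
	victim->key = key;
	victim->use = 1;
	return victim;
}
```

With linear probing, every key that hashes to the same first slot would walk the same run of neighbours. Here the stride comes from a second hash, so a short probe limit is more likely to reach a free or rarely used slot.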
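Items 5 and 6 of the quoted patch description (tolerating slightly reordered RPCs, and backing off from sequential access instead of dropping straight to random access) can be sketched roughly as below. This is an illustration under stated assumptions, not the code from the patch: the reorder window, the score cap, the halving back-off, and every name (seq_state, seq_update, REORDER_WINDOW) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SEQ_MAX        127          /* hypothetical cap on the score */
#define REORDER_WINDOW (64 * 1024)  /* hypothetical tolerance in bytes */

struct seq_state {
	uint64_t next_off;   /* offset expected for the next request */
	int      seqcount;   /* higher means "more sequential" */
	unsigned reordered;  /* counter in the spirit of vfs.nfsrv.reordered_io */
};

/* Update the heuristic for one request and return the new score. */
int seq_update(struct seq_state *s, uint64_t off, uint32_t len)
{
	assert(len > 0);

	uint64_t dist = off > s->next_off ? off - s->next_off
	                                  : s->next_off - off;

	if (off == s->next_off) {
		/* Strictly sequential: raise the score. */
		if (s->seqcount < SEQ_MAX)
			s->seqcount++;
		s->next_off = off + len;
	} else if (dist <= REORDER_WINDOW) {
		/* Near miss: treat as a reordered RPC and keep the score. */
		if (off < s->next_off)
			s->reordered++;  /* request arrived late */
		if (off + len > s->next_off)
			s->next_off = off + len;
	} else {
		/* Far away: back off gradually instead of resetting to 0. */
		s->seqcount /= 2;
		s->next_off = off + len;
	}
	return s->seqcount;
}
```

In this sketch a single swapped pair of requests leaves the score untouched and bumps the counter once, while one distant access halves the score rather than discarding the accumulated history.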