NFS client UDP retransmit timer busted for 8.n/9.n (patch)
John Baldwin
jhb at freebsd.org
Tue Dec 20 18:10:26 UTC 2011
On Sunday, December 18, 2011 7:59:13 pm Rick Macklem wrote:
> Thanks to recent detective work done by jwd@, a problem w.r.t.
> retransmit timeouts for UDP mounts (both old and new NFS clients)
> has been identified.
>
> The kernel rpc has two timeouts for UDP:
> 1 - a timeout that causes the RPC request to be retransmitted on
> the same socket, using the same xid. This one defaults to
> 3 seconds and can be set via CLSET_RETRY_TIMEOUT.
> (This is always the default of 3 seconds for FreeBSD currently.)
> 2 - a timeout that causes the socket to be destroyed and a fresh
> one created. The request is then sent on this new socket, with
> a different xid.
>
> The problem with #2 is that the retransmitted RPC request will miss
> a server's Duplicate Request Cache (DRC), because of the different xid.
> As such, #2 should be much larger than #1. However, #2 defaults to 1 second
> (i.e. smaller than #1 -> trouble!)
>
> One way to avoid this problem is to set #2 to a much larger value via the
> "timeout=<value>" mount option. (Btw, the <value> is in 1/10 seconds, so
> "timeout=600" sets it to 60 seconds.)
>
> I now have a patch that I believe deals with this correctly. It sets #1
> to the "timeout=<value>" (default 1 second) and #2 to a much larger value.
> (#2 timeouts are what the kernel RPC counts as retries, so for "soft"
> mounts, I set #2 to "nm_retry * nm_timeout / 2" and "retries = 2", so
> that it fails after "nm_retry * nm_timeout", which I think is the correct
> semantics.)
> This patch is attached and is also available at:
> http://people.freebsd.org/~rmacklem/udp-timer.patch
> (jwd@, this patch is updated from what I emailed you, so you probably want it:-)
>
> In summary, if you are using NFS mounts over UDP on FreeBSD 8 or 9 systems, you
> either want to use "timeout=600" or try the patch. You are pretty badly broken
> otherwise.
>
> Hopefully, this patch can make it into -current/head soon, rick
> ps: jhb@, could you maybe review this, thanks, rick.
It looks ok to me from what I can tell. I definitely agree that you want #2
to be much larger than #1, and I'll defer to you on the details of how to
divide nm_timeo up, etc. I do think 'nm_retry * nm_timeout' is the timeout
people expect for a soft mount.
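
For anyone following the arithmetic, here is a rough sketch of how I read the
soft mount split described above. It is not taken from the patch: the struct
and function names are made up for illustration, and the only facts it relies
on are that "timeout=<value>" is in 1/10 seconds, that #2 is set to
"nm_retry * nm_timeout / 2", and that "retries = 2".

```c
/*
 * Illustrative only: these names are hypothetical and do not come from
 * the kernel RPC code or from udp-timer.patch.
 */
struct udp_rpc_timeouts {
	int retransmit_ms;	/* #1: resend on the same socket, same xid */
	int reconnect_ms;	/* #2: new socket, new xid (misses the DRC) */
	int retries;		/* number of #2 timeouts before a soft mount fails */
};

/* The "timeout=<value>" mount option is expressed in tenths of a second. */
static int
timeout_opt_to_ms(int tenths)
{
	return (tenths * 100);
}

/*
 * Split a soft mount's overall budget as described in the quoted mail:
 * #1 gets the mount's timeout, #2 gets retry * timeout / 2, and two #2
 * timeouts are allowed, so the request fails after about retry * timeout.
 * Integer division drops half a millisecond when the product is odd.
 */
static struct udp_rpc_timeouts
soft_mount_timeouts(int timeout_tenths, int retry_count)
{
	struct udp_rpc_timeouts t;
	int timeo_ms = timeout_opt_to_ms(timeout_tenths);

	t.retransmit_ms = timeo_ms;
	t.reconnect_ms = retry_count * timeo_ms / 2;
	t.retries = 2;
	return (t);
}

/* Total time before a soft mount gives up under this split. */
static int
soft_mount_total_ms(const struct udp_rpc_timeouts *t)
{
	return (t->retries * t->reconnect_ms);
}
```

With a timeout of 10 (1 second) and a retry count of 10 (a value picked only
for the example), #1 comes out at 1 second, #2 at 5 seconds, and the request
fails after 10 seconds, which is nm_retry * nm_timeout.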
--
John Baldwin