Is the NFS Replay Cache needed for correctness with TCP mounts?

Rick Macklem rmacklem at uoguelph.ca
Tue Mar 18 23:38:46 UTC 2014


Ryan Stone wrote:
> My understanding of the NFS replay cache is that it's used so that the
> NFS server can avoid trying non-idempotent requests twice if it
> handles a retransmitted request (because the response to the first
> request was lost in transit, for example).  Is this really only
> needed for UDP mounts?  I would expect TCP mounts to not have the
> problem because the TCP layer should handle the retransmits and the
> NFS code should never see the same request twice.  Is this correct?
> 
Well, even for TCP, a client can retry a non-idempotent RPC if it
does not receive an RPC reply. The timeout is normally much longer
than for UDP, so it takes a network partition lasting some time to
trigger a retry. (The timeout varies by client, but I would expect it
to be at least 1 minute. I think the new FreeBSD client uses 5 minutes.)
Most clients (although not all NFSv3 ones) will retry the RPC on a
new TCP connection.
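
To make the retry case concrete, here is a minimal sketch of what a
replay cache buys you when a reply is lost and the client resends the
same RPC. It is not the FreeBSD code: the ReplayCache and Server
classes, the (client IP, xid) cache key and the "remove" operation are
all assumptions made for illustration.

```python
# Hypothetical sketch of a duplicate request cache (DRC); not FreeBSD code.
from dataclasses import dataclass, field


@dataclass
class ReplayCache:
    # Maps (client_ip, xid) -> saved reply of an already executed RPC.
    # Keying on IP rather than the TCP connection is an assumption, chosen
    # because the retry may arrive on a new connection.
    replies: dict = field(default_factory=dict)

    def handle(self, client_ip, xid, op):
        key = (client_ip, xid)
        if key in self.replies:
            # Retransmission: return the saved reply, do not run op again.
            return self.replies[key]
        reply = op()
        self.replies[key] = reply
        return reply


class Server:
    """Toy server with one non-idempotent operation."""

    def __init__(self):
        self.files = {"a"}

    def remove(self, name):
        if name not in self.files:
            return "ENOENT"
        self.files.remove(name)
        return "OK"


def demo():
    # With a DRC: the retry of xid 42 gets the original reply back.
    srv = Server()
    drc = ReplayCache()
    first = drc.handle("10.0.0.1", 42, lambda: srv.remove("a"))
    retry_with_drc = drc.handle("10.0.0.1", 42, lambda: srv.remove("a"))

    # Without a DRC: the retry is executed again and fails spuriously.
    bare = Server()
    bare.remove("a")
    retry_without_drc = bare.remove("a")
    return first, retry_with_drc, retry_without_drc
```

Without the cache, the client sees an error for a remove that actually
succeeded. A real DRC also has to bound its size and decide how long
entries live; the sketch ignores both.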

As such, the question really becomes "How reliable is your network
interconnect?" and "How critical is file corruption on the server?".

However, as I mention below, I don't believe that the old/default
FreeBSD 8 server uses the DRC (duplicate request cache, i.e. the
replay cache) for TCP.

> 
> I ask because I have an NFS server (using the default legacy NFS
> implementation) running FreeBSD 8.2 that is having problems with
> entries in the replay cache becoming badly corrupted, leading to mbuf
> leaks and system crashes.  I know that the NFS code has been rewritten
> as of FreeBSD 9 so hopefully the issue is fixed in future versions,
> but for the short term I'm not able to upgrade.  I control the clients
> and I know that they all use TCP mounts, so I was wondering if
> patching the server to disable the replay cache would be a plausible
> short-term workaround for the issue until I can upgrade, or if I'm
> courting disaster.
> 
As far as I know, the old/default NFS server in FreeBSD 8 does not use
the DRC for TCP. (I added DRC support for TCP to the new server to try
and improve correctness.)

I have no idea why the replay cache would be doing anything if all the
mounts are using TCP, given the old NFS server.
(I'm not sure whether "nfsstat -s" will list all RPCs as "Misses" or not
 list them at all, so I don't know if a non-zero "Misses" count indicates
 that the DRC is being used.)

rick

> Thanks,
> Ryan