How to invalidate NFS read cache?

Fri May 15 05:51:35 UTC 2009

On Tue, 12 May 2009, Robert Watson wrote:

> On Fri, 8 May 2009, Konrad Heuer wrote:
>
>> sporadically, I observe a strange but serious problem in our large NFS 
>> environment. NFS servers are Linux and OS X with StorNext/Xsan cluster 
>> filesystems, NFS clients Linux and FreeBSD.
>> 
>> NFS client A changes a file, but NFS client B (running on FreeBSD) still 
>> sees the old version. On the NFS server itself, everything looks fine.
>> 
>> AFAIK the FreeBSD kernel invalidates the NFS read cache if the file 
>> modification time on the server has changed, which should happen here but 
>> doesn't. Can I force FreeBSD (e.g. by a sysctl setting) to read file 
>> buffers again unconditionally after vfs.nfs.access_cache_timeout seconds 
>> have passed?
>
> Hi Konrad:
>
> Normally, NFS clients implement close-to-open consistency, which dictates 
> that when a close() occurs on client A, all pending writes on the file are 
> issued to the server before close() returns, so that client B, once 
> signaled to open() the file, can validate its cache before open() returns.
>
> This raises the following question: is client A closing the file, and is 
> client B then opening it?
>
> If not: relying on writes being visible on client B before the close() on 
> A and a fresh open() on B is not guaranteed to work, although we can discuss 
> ways to bring the behavior closer to what you expect.  Try modifying your 
> application and see if it gets the desired behavior, and then we can discuss 
> ways to improve what you're seeing.
>
> If you are: this is probably a bug in our caching and/or issuing of NFS RPCs. 
> We cache both attribute and access data -- perhaps there is an open() path 
> where we issue neither RPC?  In the case of open, we likely should test for a 
> valid access cache entry, and if there is one, issue an attribute read, and 
> otherwise just issue an access check which will piggyback fresh attribute 
> data on the reply.  Perhaps there is a bug here somewhere.
>
> A few other misc questions:
>
> - Could you confirm you're using NFSv3 on all clients?  Are there any special
>  mount options in use?
> - What version of FreeBSD are you running?
>
> In FreeBSD 8.x, we now have DTrace probes for all of the above events -- 
> VOPs, attribute cache hit/miss/load/flush, access cache hit/miss/load/flush, 
> RPCs, etc, which we can use to debug the problem.  I haven't yet MFC'd these 
> to 7.x, but if you're able to run a very fresh 7-STABLE, I can probably 
> produce a patch to add it for you in a few days.

Hello, Robert,

thank you very much for your reply!

The problem I observe happens with FreeBSD 6.4-R and 7.0-R with NFSv3. The 
fstab entry I use is:

server:/Volume /local/dir nfs bg,rw,intr,-T,-r32768,-w16384 0 0

The server runs on Mac OS X 10.5.

In the meantime, I had the chance to examine a failure a little more 
closely. As far as I can see at the moment, a file modified on a Linux NFS 
client gets a new modification time on the NFS server, but the FreeBSD 
client still sees the old timestamp. This apparently happens only 
sporadically, under circumstances I have not yet identified. I'll do some 
further testing over the next few days.
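
A comparison of this kind could be scripted roughly as follows. This is only 
a minimal sketch, assuming Python is available on the machines involved; the 
names snapshot and stale_fields are hypothetical helpers, not part of any NFS 
tooling. It opens the file afresh, records the modification time, size and a 
content hash as seen by that machine, and reports which fields differ between 
two such records (for example, one taken on the server and one taken on the 
FreeBSD client).

```python
import hashlib
import os


def snapshot(path):
    """Open the file afresh and record what this machine currently sees."""
    # A fresh open() is the point at which an NFS client is expected to
    # revalidate its cache under close-to-open consistency.
    with open(path, "rb") as f:
        st = os.fstat(f.fileno())
        digest = hashlib.sha256(f.read()).hexdigest()
    return {"mtime_ns": st.st_mtime_ns, "size": st.st_size, "sha256": digest}


def stale_fields(reference_view, client_view):
    """Return the names of fields where client_view differs from reference_view."""
    return sorted(
        key for key in reference_view
        if reference_view[key] != client_view.get(key)
    )
```

Running snapshot() on the same path on the server and on the suspect client 
shortly after a modification, and passing both results to stale_fields(), 
would show whether only the timestamp, only the data, or both are out of date.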

Can you think of any kind of directory or metadata caching on FreeBSD NFS 
clients that might cause this behaviour?
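
One way to probe for such caching from user space is to compare the 
modification time obtained by a path-based stat() with the one obtained by 
fstat() on a freshly opened descriptor, sampled repeatedly. The sketch below 
assumes Python on the client; attr_views and poll are hypothetical names. 
Whether the two views ever disagree on a given client is exactly the open 
question, so nothing here asserts that they will.

```python
import os
import time


def attr_views(path):
    """Return (mtime from stat() on the path, mtime from fstat() after open())."""
    by_path = os.stat(path).st_mtime_ns      # may be answered from cached attributes
    fd = os.open(path, os.O_RDONLY)          # fresh open of the same file
    try:
        by_open = os.fstat(fd).st_mtime_ns
    finally:
        os.close(fd)
    return by_path, by_open


def poll(path, interval=1.0, count=5):
    """Sample both views `count` times, `interval` seconds apart."""
    samples = []
    for i in range(count):
        samples.append((time.time(),) + attr_views(path))
        if i + 1 < count:
            time.sleep(interval)
    return samples
```

Each sample is a tuple (wall-clock time, mtime via stat, mtime via fstat). 
Running poll() on the FreeBSD client while another client modifies the file 
would show how long, if at all, an old timestamp keeps being reported.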

Best regards
Konrad

Konrad Heuer
GWDG, Am Fassberg, 37077 Goettingen, Germany, kheuer2 at gwdg.de