open() and ESTALE error

Don Lewis truckman at FreeBSD.org
Mon Jun 30 13:54:49 PDT 2003


On 29 Jun, I wrote:
> On 22 Jun, Andrey Alekseyev wrote:

>> Name cache can be purged by nfs_lookup(), if the latter finds that the
>> capability numbers don't match. In this case, nfs_lookup() will send a
>> new "lookup" RPC request to the server. Name cache can also be purged from
>> getnewvnode() and vclean(). Which code does that in the above scenario
>> is quite obscure to me. Yes, my knowledge is limited :)
> 
> The vpid == newvp->v_id test in nfs_lookup() just detects if the vnode
> that the cache entry pointed to was recycled for another use while it
> was on the free list.  It doesn't detect whether the inode on the server
> was recycled.
> 
> When I was thinking about this problem, the solution I came up with was
> a lot like the
> 	if (!VOP_GETATTR(newvp, &vattr, cnp->cn_cred, td)
>                             && vattr.va_ctime.tv_sec == VTONFS(newvp)->n_ctime)
> code fragment, but I would have done the ctime check on both the target
> and the parent directory and only ignored the cache entry if both ctimes
> had been updated.  Checking only the target should be more conservative,
> though it would be slower because there would be more cases where the
> client would have to do the RPC call.

I actually meant to say the mtime of the parent directory.
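
To make the difference between the two policies concrete, here is a
minimal sketch.  It is not kernel code: struct stamps and both function
names are made up for illustration, and the timestamps are assumed to
have been fetched already.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical pair of timestamps (seconds) for one name cache entry. */
struct stamps {
	time_t target_ctime;	/* ctime of the file the entry points to */
	time_t parent_mtime;	/* mtime of the directory holding the name */
};

/* Target-only policy: any change to the target's ctime forces the RPC. */
static bool
use_cache_target_only(const struct stamps *cached, const struct stamps *now)
{
	assert(cached != NULL && now != NULL);
	return (now->target_ctime == cached->target_ctime);
}

/*
 * Two-timestamp policy: ignore the entry only if BOTH the target ctime
 * and the parent mtime have changed since the entry was cached.
 */
static bool
use_cache_both(const struct stamps *cached, const struct stamps *now)
{
	assert(cached != NULL && now != NULL);
	return (now->target_ctime == cached->target_ctime ||
	    now->parent_mtime == cached->parent_mtime);
}
```

If only the target ctime changes (a chmod, for example), the first
policy does the lookup RPC and the second keeps using the cache entry.
That is why the target-only check is the more conservative and the
slower of the two.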

After doing some more testing, I believe the problem I'm seeing is
caused by a rename on the server leaving the seconds field of the file
ctime unchanged.  If the file was last changed at time N, the client
does a lookup on the file and sees this ctime value, and the server then
renames the file before the time on the server increments to the next
second, the ctime check in nfs_lookup() won't detect that the cached
lookup information might be invalid.
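
The one-second blind spot can be shown with a small userland sketch.
This is only an illustration of a seconds-only comparison, not the
nfs_lookup() code itself; struct cached_ctime and ctime_check_passes()
are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the seconds-only ctime the client remembers. */
struct cached_ctime {
	time_t n_ctime;
};

/*
 * Seconds-only comparison, in the spirit of the ctime check.
 * Returns true when the cached entry would be treated as still valid.
 */
static bool
ctime_check_passes(const struct cached_ctime *cc,
    const struct timespec *server_ctime)
{
	assert(cc != NULL && server_ctime != NULL);
	/* tv_nsec is never consulted, so sub-second changes are invisible. */
	return (server_ctime->tv_sec == cc->n_ctime);
}
```

With a cached value of 1000, a rename that sets the server ctime to
1000.9 still passes the check, while one at 1001.0 is caught.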

The best way I could think of to fix this problem is to ignore the cache
entry and do the lookup RPC until we detect that the time on the server
has incremented to the next second, so that we know that the cached
lookup must be valid.  The problem is that I don't know how to get a
timestamp from the server.
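
Roughly, the idea would look like the sketch below.  It assumes that
each lookup reply somehow comes with the server's current time, which is
exactly the piece I don't know how to obtain, so server_now_sec is
purely hypothetical, as are struct cached_lookup and both function
names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical per-entry state kept alongside the cached lookup. */
struct cached_lookup {
	time_t ctime_sec;	/* file ctime (seconds) seen by the lookup RPC */
	bool settled;		/* server clock had already passed ctime_sec */
};

/*
 * Called after every lookup RPC.  server_now_sec is an ASSUMED server
 * timestamp taken at the time of that RPC.
 */
static void
fill_from_lookup(struct cached_lookup *cl, time_t ctime_sec,
    time_t server_now_sec)
{
	assert(cl != NULL);
	cl->ctime_sec = ctime_sec;
	/* Any later change must now land in a later second than ctime_sec. */
	cl->settled = (server_now_sec > ctime_sec);
}

/* true: the cache entry may be used; false: do the lookup RPC again. */
static bool
may_use_cache(const struct cached_lookup *cl, time_t current_ctime_sec)
{
	assert(cl != NULL);
	return (cl->settled && current_ctime_sec == cl->ctime_sec);
}
```

An entry filled while the server clock still reads the same second as
the file's ctime is never trusted, so the client keeps doing the RPC
until a lookup lands in a later second.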


>> I've also done a number of tcpdump's for different test patterns and I
>> believe, what happens with the cached vnode may depend on the results of
>> the "access" RPC request to the server.
> 
> That may be an important clue.  The access cache may be properly
> working, but the attribute cache timeout may be broken.

I'm pretty sure that the problem that you are having with open()
returning ESTALE is caused by the difference between the access cache
timeout and the attribute cache timeout.  It looks like your workaround
of retrying the open only works with NFSv3, because NFSv2 relies on
VOP_GETATTR(), and if the attribute cache timeout is too long, the
open() will succeed and you'll only detect the failure when you actually
do the I/O.


