open() and ESTALE error

Andrey Alekseyev uitm at blackflag.ru
Fri Jun 20 12:20:17 PDT 2003


Terry,

> The place to correct this is probably the underlying FS.  I'd
> argue that getting ESTALE is a poke with a sharp stick that
> makes this more likely to happen.  ;^).

Initially I was going to "fix" the underlying FS (that is, the NFS code).
But it's extremely hard to do cleanly, because I need to look the name up
again(!), and the name is not accessible (easily? at all?) below the VFS layer.

> > I think this is exactly what happens :) Actually, I believe, I'm just
> > getting another namecache entry with another vnode/nfsnode/file handle.
> 
> You can't have this for other reasons; specifically, if you have
> the file open at the time of the rename, and it becomes a ".#nfs..."
> file (or whatever) on the server.

I didn't trace the "sillyrename" scenario much, but I believe
nfs_sillyrename() handles it properly. At least, it uses nfs_lookitup(),
which may actually *update* the file handle, and it purges the name cache
as well. So I don't consider it a real problem.

However, for open() for reading/writing the scenario looks quite clear to me.
As I said in my previous message to Don, I'm just trying to eliminate
the need to modify an otherwise generic application so that it immediately
retries open() if the first open() failed with ESTALE, at least for a certain
more or less common situation :)  And I know that the second open() from the
userland application always works for the case I've described.

> Don points out that Solaris tries to fix this via the "noac" mount
> option for client NFS.

It does bad things to performance, though :)  I'm not trying to uncache
everything. It's safe for me to use the file's page cache if open() succeeds.
I'm not trying to achieve absolute shared-file integrity with NFS, believe
me :)

> 	{ A, B, C }
> fd1 open on B
> fd2 open on C
> rename B -> C
> rename A -> B
> 
> ?  With your patch, I think we would potentially convert fd2 to point
> to B when it really *should* be "ESTALE", which is wrong (think in
> terms of 2 or more clients doing the operations).

You didn't specify client or server side, though. The result heavily
depends on the exact scenario.

With a single client, a new open() for "C" will refer to the same file as
fd2 if the original "C" is still open (because of sillyrename?).
Without fd2, any new open() for "C" will get a valid file handle for what
originally was "B". And that is the correct behaviour.

If the renames were on the server, then fd1 will be valid until the last
client's close. However, any reference to the original "C" will fail.
Re-opening "C" should result in a new file handle for what originally was "B".

Am I wrong?

