open() and ESTALE error

Andrey Alekseyev uitm at blackflag.ru
Thu Jun 19 02:55:43 PDT 2003


Hello,

Lately I've been trying to solve a problem where open() fails with
ESTALE in the following situation:

1. NFS server: echo "1111" > file01
2. NFS client: cat file01
3. NFS server: echo "2222" > file02 && mv file02 file01
4. NFS client: cat file01 (either old file01 contents or ESTALE)
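
For anyone who wants to see the exact errno on the client rather than
cat's message, steps 2 and 4 could be driven by a small helper like the
one below. This is only a sketch: cat_once() is a hypothetical name, not
part of any existing tool, and it simply reports the errno of whichever
call fails.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to len-1 bytes from path into buf and
 * NUL-terminate the result.  Returns 0 on success, or the errno value
 * of the failing call (ESTALE is the case of interest in step 4).
 */
int
cat_once(const char *path, char *buf, size_t len)
{
	ssize_t n;
	int fd, saved;

	if (len == 0)
		return (EINVAL);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return (errno);
	n = read(fd, buf, len - 1);
	saved = errno;		/* close() may overwrite errno */
	close(fd);
	if (n < 0)
		return (saved);
	buf[n] = '\0';
	return (0);
}
```

Calling cat_once("file01", buf, sizeof buf) on the client in step 4 and
comparing the return value against ESTALE separates the "old contents"
outcome (return 0) from the stale-handle one.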

My investigation shows that the problem is actually in VOP_ACCESS(),
which is called from vn_open(). If nfs_access() decides to "go to the wire"
in step 4, it uses a cached file handle, which is by then stale. Thus
open() eventually fails with ESTALE as well (the ESTALE comes from the
underlying nfs_request()).

I understand all the fundamental NFS-related integrity problems, but not
this one :) That is, I see no reason for open() to fail to open a file for
reading or writing if the system knows the problem is its own. Why not
just do another lookup and try to obtain a valid file handle?
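
The "look it up again" idea can be illustrated in user space as a
retry-once wrapper around open(). This is a minimal sketch, not the
kernel change itself: open_retry_estale(), open_fn and fake_stale_once()
are hypothetical names, and the opener is passed in only so the logic
can be exercised without an NFS mount. Whether a second attempt really
succeeds on a given client depends on the stale handle having been
dropped after the first failure, which this sketch does not guarantee.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical opener type; lets a test substitute a fake for open(). */
typedef int (*open_fn)(const char *path, int flags, mode_t mode);

static int
default_open(const char *path, int flags, mode_t mode)
{
	return (open(path, flags, mode));
}

/*
 * Call do_open() and, if it fails with ESTALE, call it exactly once
 * more.  Any other error, or a second ESTALE, is returned as is.
 * Passing NULL for do_open uses the real open().
 */
int
open_retry_estale(open_fn do_open, const char *path, int flags, mode_t mode)
{
	int fd = -1, tries;

	if (do_open == NULL)
		do_open = default_open;
	for (tries = 0; tries < 2; tries++) {
		fd = do_open(path, flags, mode);
		if (fd >= 0 || errno != ESTALE)
			break;
	}
	return (fd);
}

/* Stand-in opener: fails with ESTALE on the first call only. */
int fake_calls;

int
fake_stale_once(const char *path, int flags, mode_t mode)
{
	(void)path; (void)flags; (void)mode;
	if (fake_calls++ == 0) {
		errno = ESTALE;
		return (-1);
	}
	return (open("/dev/null", O_RDONLY));
}
```

With fake_stale_once as the opener, the first call fails, the wrapper
retries, and the second call returns a valid descriptor. The retry is
capped at one so that a handle that stays stale still surfaces as ESTALE
instead of looping.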

I have been experimenting with different parts of the kernel while "fixing"
this for myself. I believe the simplest patch would be to
vfs_syscalls.c:open(), though I've also made a working patch against
vn_open().

Could anyone please be so kind as to comment on this issue?

TIA

--- kern/vfs_syscalls.c.orig	Thu Jun 19 13:22:50 2003
+++ kern/vfs_syscalls.c	Thu Jun 19 13:29:11 2003
@@ -1008,6 +1008,7 @@
 	int type, indx, error;
 	struct flock lf;
 	struct nameidata nd;
+	int stale = 0;
 
 	oflags = SCARG(uap, flags);
 	if ((oflags & O_ACCMODE) == O_ACCMODE)
@@ -1025,8 +1026,15 @@
 	 * the descriptor while we are blocked in vn_open()
 	 */
 	fhold(fp);
+again:
 	error = vn_open(&nd, flags, cmode);
 	if (error) {
+		/*
+		 * if the underlying filesystem returns ESTALE
+		 * we must have used a cached file handle.
+		 */
+		if (error == ESTALE && stale++ == 0)
+			goto again;
 		/*
 		 * release our own reference
 		 */

