Fwd: Re: FreeBSD NFS client goes into infinite retry loop

Steve Polyack korvus at comcast.net
Tue Mar 23 13:37:59 UTC 2010

On 03/22/10 19:53, Rick Macklem wrote:
> On Mon, 22 Mar 2010, John Baldwin wrote:
> >>  It looks like it also returns ESTALE when the inode is invalid
> >>  (< ROOTINO || > max inodes?) - would an unlinked file in FFS referenced at
> >>  a later time report an invalid inode?
> >>
> I'm no ufs guy, but the only way I can think of is if the file system
> on the server was newfs'd with fewer i-nodes? (Unlikely, but...)
> (Basically, it is safe to return ESTALE for anything that is not
>   a transient failure that could recover on a retry.)
> >>  But back to your point, zfs_zget() seems to be failing and returning the
> >>  EINVAL before zfs_fhtovp() even has a chance to set and check zp_gen.
> >>  I'm trying to get some more details through the use of gratuitous
> >>  dprintf()'s, but they don't seem to be making it to any logs or the
> >>  console even with vfs.zfs.debug=1 set.  Any pointers on how to get these
> >>  dprintf() calls working?
> I know diddly (as in absolutely nothing) about zfs.
> >
> >  That I have no idea on.  Maybe Rick can chime in?  I'm actually not sure why
> >  we would want to treat a FHTOVP failure as anything but an ESTALE error in the
> >  NFS server to be honest.
> >
> As far as I know, only if the underlying file system somehow has a
> situation where the file handle can't be translated at that point in time,
> but could be able to later. I have no idea if any file system is like that
> and I don't think such a file system would be an appropriate choice for an NFS
> server, even if such a beast exists. (Even then, although FreeBSD's client
> assumes EIO might recover on a retry, that isn't specified in any RFC, as
> far as I know.)
> That's why I proposed a patch that simply translates all VFS_FHTOVP()
> errors to ESTALE in the NFS server. (It seems simpler than chasing down
> cases in all the underlying file systems?)
> rick, chiming in:-)

Makes sense to me.  I'll continue to bang on NFS with your initial patch 
in my lab for a while.  Should I open a PR for further discussion / 
resolution of the issue in -CURRENT / STABLE?
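
For anyone following along, here is a rough user-space sketch of the idea
behind the patch as I understand it: any non-zero error from VFS_FHTOVP() is
reported to the client as ESTALE, so an EINVAL from zfs_zget() no longer
reaches the client as something it will retry forever. This is not Rick's
actual patch; fhtovp_error_to_nfs() and check_fhtovp_mapping() are
hypothetical names made up for illustration.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper, not the actual patch: collapse any non-zero
 * VFS_FHTOVP()-style error into ESTALE and pass success (0) through.
 */
static int
fhtovp_error_to_nfs(int error)
{
	return (error != 0 ? ESTALE : 0);
}

/* Self-check: success passes through; EINVAL and EIO both become ESTALE. */
static void
check_fhtovp_mapping(void)
{
	assert(fhtovp_error_to_nfs(0) == 0);
	assert(fhtovp_error_to_nfs(EINVAL) == ESTALE);
	assert(fhtovp_error_to_nfs(EIO) == ESTALE);
}
```

The point of doing it in one place in the server is that none of the
underlying file systems need to be audited for which errno they return.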

Steve Polyack

