NFS (& amd?) dysfunction descending a hierarchy

Kostik Belousov kostikbel at gmail.com
Wed Dec 10 09:06:27 PST 2008


On Wed, Dec 10, 2008 at 08:50:22AM -0800, David Wolfskill wrote:
> On Wed, Dec 10, 2008 at 11:30:26AM -0500, Rick Macklem wrote:
> >... 
> > The different behaviour for -CURRENT could be the newer RPC layer that
> > was recently introduced, but that doesn't explain the basic problem.
> 
> OK.
> 
> > All I can think of is to ask the obvious question. "Are you using
> > interruptible or soft mounts?" If so, switch to hard mounts and see
> > if the problem goes away. (imho, neither interruptible nor soft mounts
> > are a good idea. You can use a forced dismount if there is a crashed
> > NFS server that isn't coming back anytime soon.)
> 
> From examination of /etc/amd* -- I don't see how to get mount(8) or
> amq(8) to report it -- it appears that we are using interruptible
> mounts, as we always have.
> 
> The point is that the behavior has changed in an unexpected way.  And
> I'm not so sure that the use of a forced dismount is generally
> available, as it would require logging in to the NFS client first, which
> may be difficult if the NFS server hosting non-root home directories is
> failing to respond and direct root login via ssh(1) is not permitted (as
> is the default).
> 
> > If you are getting this with hard mounts, I'm afraid I have no idea
> > what the problem is, rick.
> 
> What concerns me is that even if the attempted unmount gets EBUSY, the
> user-level process descending the directory hierarchy is getting ENOENT
> trying to issue fstatfs() against an open file descriptor.
> 
> I'm having trouble figuring out any way that makes any sense.

Basically, the problem is that NFS uses shared lookup, and this allows
for a bug where several negative namecache entries are created for the
same non-existent node. Then this node gets created, removing only the
first negative namecache entry. For some reason, the vnode is reclaimed;
amd's tasting of unmount is a good reason for the vnode to be reclaimed.

Now you have an existing path and a leftover negative cache entry for
it, so a lookup of that name can fail with ENOENT. This was reported by
Peter Holm first; I listed the relevant revisions that should fix this
in my previous mail.
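
The sequence described above can be sketched with a toy model. This is
an illustration only, not the FreeBSD namecache code: the class, its
methods, the head-of-chain insertion order, and the names "file" and
"vnode-1" are all assumptions made up for the example.

```python
class ToyNameCache:
    """Toy stand-in for a name cache; NOT the FreeBSD implementation."""

    def __init__(self):
        # Ordered chain of (name, vnode); vnode is None for a negative entry.
        self.chain = []

    def enter(self, name, vnode):
        # Assumption: no duplicate check, modelling lookups that run under a
        # shared lock, all miss, and all add their own entry for the name.
        self.chain.insert(0, (name, vnode))

    def lookup(self, name):
        # The first match on the chain wins.
        for n, v in self.chain:
            if n == name:
                return "neg" if v is None else "pos"
        return "miss"

    def drop_first_negative(self, name):
        # Models the create path removing only one negative entry.
        for i, (n, v) in enumerate(self.chain):
            if n == name and v is None:
                del self.chain[i]
                return

    def reclaim(self, vnode):
        # Reclaiming a vnode purges the positive entries that point at it.
        self.chain = [(n, v) for n, v in self.chain if v != vnode]


def run_scenario():
    cache = ToyNameCache()
    on_server = set()  # what the hypothetical NFS server really holds

    # Two shared lookups of a missing name each add a negative entry.
    cache.enter("file", None)
    cache.enter("file", None)

    # The file is created: one negative entry goes, a positive one is added.
    on_server.add("file")
    cache.drop_first_negative("file")
    cache.enter("file", "vnode-1")
    before = cache.lookup("file")

    # The vnode is reclaimed (for example after an unmount attempt).
    cache.reclaim("vnode-1")
    after = cache.lookup("file")
    return before, after, "file" in on_server
```

In this model, run_scenario() reports a positive hit before the reclaim
and a negative hit afterwards, even though the file still exists on the
server. That corresponds to the ENOENT on an existing path.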


More information about the freebsd-hackers mailing list