Deadlock between nfsd and snapshots.

Tor Egge Tor.Egge at cvsup.no.freebsd.org
Tue Aug 22 21:47:18 UTC 2006


> I have a proposal.
> 
> 1. Remove IN_ACCESS, IN_UPDATE, IN_CHANGE from i_flag. For each flag,
> introduce two new i_ fields, e.g., i_access of type timespec, and
> i_accessed of boolean type.

On amd64, sizeof(struct timespec) is 16 bytes and sizeof(struct boolean_t) is 4
bytes.  3 * (16 + 4) = 60 bytes extra per inode.  With 100K inodes that becomes
6 MB extra memory.

I don't see why all these extra fields are needed.

> 2. All places that currently set IN_ACCESS, instead would increment
> i_accessed using the atomic ops. ufs_itimes shall update i_access under some
> mutex if i_accessed is greater than zero.

Protecting the existing i_flag and the timestamps with the vnode interlock when
the current thread only has a shared vnode lock should be sufficient to protect
against the races, removing the need for #3, #4 and #4 below.

What's left is avoiding setting IN_MODIFIED when it's unsafe, to protect
against the deadlock.

> 3. Check the i_access instead of the IN_ACCESS.

> 4. ffs_update and ffs_syncvnode shall do the DIP_SET(i_atime) under the mutex
> from #2 before the main run and set IN_MODIFIED accordingly if i_accessed is
> not 0.

> 4. ufs_getattr shall retrieve the *time from new i_ fields under the mutex
> from #2 if corresponding i_ flag is set.

> Basically, I want to set IN_MODIFIED i_flag (induced by IN_ACCESS and others)
> only under exclusive vnode lock. Moreover, i_accessed can be zeroed only
> under exclusive lock. This way, even shared lock on the vnode shall be enough
> to safely update modification times, and the times are moved to the disk
> often enough (at least, at the sync of the syncer vnodes).

An exclusive vnode lock isn't needed, see above.  Holding an exclusive vnode
lock does not make it safe to set IN_MODIFIED.

There are some constraints with regards to setting IN_MODIFIED on an inode.

If neither IN_CHANGE nor IN_UPDATE is set then it might be unsafe to set
IN_MODIFIED since the file system might be suspended or in the process of being
suspended with the vnode sync loop in ffs_sync() having iterated past the
vnode.

If the file system is suspended then IN_MODIFIED cannot be set. If IN_MODIFIED,
IN_CHANGE or IN_UPDATE is set and the file system is suspended then something
is wrong.

If the file system is in the process of being suspended then IN_MODIFIED can be
set at the cost of triggering a restart of the vnode sync loop in ffs_sync().
If either IN_MODIFIED, IN_CHANGE or IN_UPDATE is already set then the vnode
sync loop has not reached the vnode, and a restart isn't needed.

When ufs_itimes() cannot set IN_MODIFIED then it has to either risk losing the
access time update or use some mechanism to defer it (e.g. set IN_LAZYMOD or
a new flag and let process_deferred_inactive() set IN_MODIFIED after the file
system has been resumed).

- Tor Egge


More information about the freebsd-fs mailing list