kern/165392: Multiple mkdir/rmdir fails with errno 31
Jilles Tjoelker
jilles at stack.nl
Mon May 27 17:00:03 UTC 2013
The following reply was made to PR kern/165392; it has been noted by GNATS.
From: Jilles Tjoelker <jilles at stack.nl>
To: Jaakko Heinonen <jh at FreeBSD.org>
Cc: bug-followup at FreeBSD.org, vvv at colocall.net, mckusick at FreeBSD.org
Subject: Re: kern/165392: Multiple mkdir/rmdir fails with errno 31
Date: Mon, 27 May 2013 18:53:28 +0200
On Mon, May 20, 2013 at 10:21:34PM +0300, Jaakko Heinonen wrote:
> On 2012-02-25, Jilles Tjoelker wrote:
> > > [mkdir fails with [EMLINK], but link count < LINK_MAX]
> > I can reproduce this problem with UFS with soft updates (with or without
> > journaling).
> > A reproduction without C programs is:
> >     cd empty_dir
> >     mkdir `jot 32766 1` # the last one will fail (correctly)
> >     rmdir 1
> >     mkdir a # will erroneously fail
> > The problem appears to be because the previous rmdir has not yet been
> > fully completed. It is still holding onto the link count until the
> > directory is written, which may take up to two minutes.
> > The same problem can occur with other calls that increase the link count
> > such as link() and rename().
> > A workaround is to call fsync() on the directory that contained the
> > deleted entries. It will then release its hold on the link count and
> > allow mkdir or other calls. If fsync() is only called when [EMLINK] is
> > returned, the performance impact should not be very bad, although it
> > still causes more I/O than necessary.
> I tried to implement this with the following patch:
> http://people.freebsd.org/~jh/patches/ufs-check_linkcnt.diff
> However, VOP_FSYNC(9) with the MNT_WAIT flag seems not to update the
> i_nlink count for a reason unknown to me. I can verify that also by
> taking your reproduction recipe above and adding "fsync ." between
> "rmdir 1" and "mkdir a".
fsync certainly helps but not as effectively as you'd want. Some
combination of sleeps, fsyncs and mkdir attempts appears to be needed. A
shell loop like
    rmdir 8; fsync .; \
    until mkdir h 2>/dev/null; do printf .; fsync .; sleep 1; done
takes two seconds.
However, in
    rmdir 13; mkdir m; fsync .; \
    until mkdir m 2>/dev/null; do printf .; sleep 1; done
the fsync is of no benefit. It is just as slow as omitting it (about
half a minute).
In my earlier experiments I gave the commands separately, so typing and
recalling them must have taken long enough for the link count to be
released by the time I ran mkdir.
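For a userland program, the workaround discussed in this PR (call fsync()
on the directory only after [EMLINK]) could look roughly like the C sketch
below. The helper name mkdir_retry_emlink and its max_retries parameter are
made up for illustration and are not part of any patch here. Because a
single fsync may not be enough, it retries with a one-second sleep, like the
first shell loop above. Given the second experiment, it is no guarantee of a
quick result.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: create directory `path`, whose parent directory is
 * `parent`.  If mkdir(2) fails with EMLINK, fsync the parent, sleep one
 * second and try again, at most `max_retries` times.  Any other error is
 * returned immediately.  Returns 0 on success, -1 with errno set on failure.
 */
int
mkdir_retry_emlink(const char *path, const char *parent, mode_t mode,
    int max_retries)
{
	int dfd, saved, tries;

	for (tries = 0; ; tries++) {
		if (mkdir(path, mode) == 0)
			return (0);
		/* Only EMLINK may be caused by a pending rmdir. */
		if (errno != EMLINK || tries >= max_retries)
			return (-1);

		/* fsync the directory that contained the deleted entries. */
		dfd = open(parent, O_RDONLY);
		if (dfd == -1)
			return (-1);
		if (fsync(dfd) == -1) {
			saved = errno;
			close(dfd);
			errno = saved;
			return (-1);
		}
		close(dfd);
		sleep(1);
	}
}
```

A caller in the reproduction above might use
mkdir_retry_emlink("a", ".", 0777, 30), where 30 is an arbitrary limit.
The extra I/O is only incurred on the [EMLINK] path.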
> Does this mean that fsync(2) is broken for directories on softdep
> enabled UFS?
I don't think fsync(2) has to sync the exact link count to disk, since
fsck will take care of that. However, it has to sync the timestamps,
permissions and directory entries.
> I have cc'd Kirk in hope he could shed some light on this.
I'm also interested in whether it is safe to call VOP_FSYNC at that
point, especially in the case of a rename where a lock on the source
directory vnode may be held at the same time.
--
Jilles Tjoelker
More information about the freebsd-fs mailing list