kern/165392: Multiple mkdir/rmdir fails with errno 31

Jilles Tjoelker jilles at stack.nl
Sat Feb 25 18:30:16 UTC 2012


The following reply was made to PR kern/165392; it has been noted by GNATS.

From: Jilles Tjoelker <jilles at stack.nl>
To: bug-followup at FreeBSD.org, vvv at colocall.net
Cc:  
Subject: Re: kern/165392: Multiple mkdir/rmdir fails with errno 31
Date: Sat, 25 Feb 2012 19:27:02 +0100

 > [mkdir fails with [EMLINK], but link count < LINK_MAX]
 
 I can reproduce this problem with UFS with soft updates (with or without
 journaling).
 
 A reproduction without C programs is:
 
 cd empty_dir
 mkdir `jot 32766 1`     # the last one will fail (correctly)
 rmdir 1
 mkdir a                 # will erroneously fail
 
 The problem appears to be that the previous rmdir has not yet fully
 completed. It holds on to the link count until the directory is
 written, which may take up to two minutes.
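 
 The mismatch described in the report can be checked from userland by
 comparing a directory's link count with the file system's limit. A
 minimal sketch, assuming stat() returns the count the reporter saw and
 using pathconf() with _PC_LINK_MAX for the limit; link_headroom() is a
 hypothetical helper name, not an existing API:

```c
#include <sys/stat.h>
#include <unistd.h>

/*
 * Fetch the link count of `path` and the link limit of its file system.
 * Returns 0 on success, -1 if stat() fails or pathconf() reports an
 * error or no limit. (Hypothetical helper for illustration.)
 */
static int
link_headroom(const char *path, long *nlink, long *link_max)
{
	struct stat sb;
	long max;

	if (stat(path, &sb) == -1)
		return (-1);
	max = pathconf(path, _PC_LINK_MAX);
	if (max == -1)
		return (-1);
	*nlink = (long)sb.st_nlink;
	*link_max = max;
	return (0);
}
```

 If mkdir fails with [EMLINK] while this reports nlink < link_max, the
 directory is probably in the state described above.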
 
 The same problem can occur with other calls that increase the link count
 such as link() and rename().
 
 A workaround is to call fsync() on the directory that contained the
 deleted entries. It will then release its hold on the link count and
 allow mkdir or other calls. If fsync() is only called when [EMLINK] is
 returned, the performance impact should not be very bad, although it
 still causes more I/O than necessary.
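 
 A minimal sketch of this workaround, assuming the caller knows the
 directory that contained the deleted entries (for mkdir, the parent).
 mkdir_retry_emlink() is a hypothetical name; it calls fsync() only
 after an [EMLINK] failure and retries once:

```c
#include <sys/stat.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Create `name` inside `parent`. If mkdirat() fails with EMLINK, fsync()
 * the parent directory so pending removals are written, then retry once.
 * Returns 0 on success, -1 on failure with errno set.
 */
static int
mkdir_retry_emlink(const char *parent, const char *name, mode_t mode)
{
	int dfd, error, saved_errno;

	dfd = open(parent, O_RDONLY);
	if (dfd == -1)
		return (-1);

	error = mkdirat(dfd, name, mode);
	if (error == -1 && errno == EMLINK) {
		/* Only pay for the extra I/O when the link limit was hit. */
		if (fsync(dfd) == 0)
			error = mkdirat(dfd, name, mode);
	}

	/* Preserve the errno of the failing call across close(). */
	saved_errno = errno;
	(void)close(dfd);
	errno = saved_errno;
	return (error);
}
```

 A second [EMLINK] after the fsync() means the directory really is at
 LINK_MAX, so the sketch does not loop.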
 
 The book "The Design and Implementation of the FreeBSD Operating System"
 contains a detailed description of soft updates in section 8.6 Soft
 Updates. The subsection "File Removal Requirements for Soft Updates"
 appears particularly relevant to this problem.
 
 A possible solution is to check for the problematic situation
 (i_effnlink < LINK_MAX && i_nlink >= LINK_MAX) and if so synchronously
 write one or more deleted directory entries that pointed to the inode
 with the link count problem. After that, i_nlink should be less than
 LINK_MAX and the link count can be checked again (depending on whether
 locks need to be dropped to do the write, it may or may not be possible
 for another thread to use up the last link first).
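 
 The check can be illustrated with a small userland model. This is not
 kernel code: struct model_inode, the model_* functions and
 MODEL_LINK_MAX (set to 32767, which is assumed to be the LINK_MAX in
 effect) are hypothetical, locking is ignored, and the flush writes all
 pending removals at once rather than "one or more":

```c
#include <stdbool.h>

#define MODEL_LINK_MAX	32767	/* assumed LINK_MAX; model constant only */

struct model_inode {
	int i_nlink;	/* still counts removals not yet written */
	int i_effnlink;	/* count once pending removals complete */
};

/* rmdir of a subdirectory: only the effective count drops at once. */
static void
model_remove_link(struct model_inode *ip)
{
	ip->i_effnlink--;
}

/* The directory write completes: i_nlink catches up. */
static void
model_flush(struct model_inode *ip)
{
	ip->i_nlink = ip->i_effnlink;
}

/* The problematic situation: only pending removals block a new link. */
static bool
model_needs_flush(const struct model_inode *ip)
{
	return (ip->i_effnlink < MODEL_LINK_MAX &&
	    ip->i_nlink >= MODEL_LINK_MAX);
}

/* Add a link; returns 0 on success, -1 where the kernel would give EMLINK. */
static int
model_add_link(struct model_inode *ip)
{
	if (model_needs_flush(ip))
		model_flush(ip);	/* stands in for the synchronous write */
	if (ip->i_nlink >= MODEL_LINK_MAX)
		return (-1);	/* re-check after the flush */
	ip->i_nlink++;
	ip->i_effnlink++;
	return (0);
}
```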
 
 For mkdir() and rename(), the directory that contains the deleted
 entries is obvious (the directory that will contain the new directory)
 while for link() it can (in the general case) only be found in soft
 updates data structures. Soft updates must track this because (if the
 link count became 0) it will not clear the inode before all directory
 entries that pointed to it have been written.
 
 Simply replacing the i_nlink < LINK_MAX check with i_effnlink < LINK_MAX
 is unsafe because it will lead to overflow of the 16-bit signed i_nlink
 field. If the field is made larger, I don't see what prevents the code
 from committing a set of changes that leaves an inode on disk with more
 than LINK_MAX links for some time (for example, if a file in the new
 directory is fsynced while the old directory entries are still on
 disk).
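 
 The overflow can be shown with plain arithmetic. A sketch, assuming
 LINK_MAX equals INT16_MAX (32767); both function names are
 hypothetical and only model a check that looks at i_effnlink alone:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Value i_nlink would have to hold after adding a link when the limit
 * check looks only at i_effnlink. Returns -1 if the check refuses.
 */
static int32_t
nlink_after_effnlink_check(int16_t i_nlink, int16_t i_effnlink)
{
	if (i_effnlink >= INT16_MAX)	/* stands in for LINK_MAX */
		return (-1);
	return ((int32_t)i_nlink + 1);
}

/* True if `v` is representable in a 16-bit signed field. */
static bool
fits_int16(int32_t v)
{
	return (v >= INT16_MIN && v <= INT16_MAX);
}
```

 With i_nlink at 32767 and i_effnlink at 32766 (one removal pending),
 the check passes but the result, 32768, no longer fits in the field.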
 
 -- 
 Jilles Tjoelker

