deadlocks with intr NFS mounts and ^Z (or: PCATCH and sleepable locks)

Bruce Evans brde at optusnet.com.au
Sat Jun 20 04:45:23 UTC 2009


On Fri, 19 Jun 2009, Kostik Belousov wrote:

> On Fri, Jun 19, 2009 at 06:23:28PM +0200, Jilles Tjoelker wrote:
>> I have been having trouble with deadlocks with NFS mounts for a while,
>> and I have found at least one way it can deadlock. It seems an issue
>> with the sleep/lock system.
>>
>> NFS sleeps while holding a lockmgr lock, waiting for a reply from the
>> server. When the mount is set intr, this is an interruptible sleep, so
>> that interrupting signals can abort the sleep. However, this also means
>> that SIGSTOP etc will suspend the thread without waking it up first, so
>> it will be suspended with a lock held.
>>
>> If it holds the wrong locks, it is possible that the shell will not be
>> able to run, and the process cannot be continued in the normal manner.
>>
>> Due to some other things I do not understand, it is then possible that
>> the process cannot be continued at all (SIGCONT seems ignored), but in
>> simple cases SIGCONT works, and things go back to normal.
>> ...
>> Also, making SIGSTOP and the like interrupt/restart syscalls is not
>> acceptable unless you find some way to do it such that userland won't
>> notice. For example, a read of 10 megabytes from a regular file with
>> that much available must not return less than 10 megabytes.
>
> See
> http://lists.freebsd.org/pipermail/freebsd-smp/2009-January/001611.html

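To illustrate the userland concern quoted above: if stop signals made
syscalls return early, every program reading a regular file would need
a loop like the one below, and many programs do not have one. This is
only a minimal Python sketch; the names read_fully and demo are
hypothetical and chosen for illustration.

```python
import os
import tempfile


def read_fully(fd, nbytes):
    """Read exactly nbytes from fd unless end of file is reached first."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = os.read(fd, remaining)  # may return fewer bytes than asked
        if not chunk:                   # empty result means end of file
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def demo(size=1 << 20):
    """Write size bytes to a temporary file and read them back in full."""
    with tempfile.TemporaryFile() as f:
        f.write(b"x" * size)
        f.flush()
        os.lseek(f.fileno(), 0, os.SEEK_SET)  # rewind the descriptor
        return len(read_fully(f.fileno(), size))


if __name__ == "__main__":
    print(demo())
```

Code that calls read(2) once and assumes a full result is exactly the
kind of userland that would notice a changed kernel behaviour.
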
Have any fixes been applied?  I now remember seeing problems like the
first set above on FreeBSD cluster machines (I don't encounter "intr"
NFS mounts anywhere else; mount(8) still doesn't show the "intr"
option so I assume that the "intr" specified in fstab is in use on
the FreeBSD machines):

     normal resume after ^Z on a parallel build not working, sometimes
     hanging the whole file system but other times recoverable after
     re-logging in and sending suitable SIGCONTs manually

These problems seemed to go away, but right now the following problem
like the second set above occurs consistently (I first noticed this
last week):

     Script started on Sat Jun 20 02:32:51 2009
     pts/0:bde@ref8-i386:~/sys7/i386/compile> sh zm
     ^Z
     [1]+  Stopped                 sh zm
     pts/0:bde@ref8-i386:~/sys7/i386/compile> %
     sh zm
     *** Stopped -- signal 18
     *** Stopped -- signal 18
     *** Stopped -- signal 18
     *** Signal 1
     *** Signal 1
     *** Signal 1
     `all' not remade because of errors.
     linking kernel
     ^C
     pts/0:bde@ref8-i386:~/sys7/i386/compile> exit

     Script done on Sat Jun 20 02:34:41 2009

The shell script zm builds 6 kernels in parallel using make -k -j8 for
each.  Signal 18 is SIGTSTP.  Receiving this is normal, but the shell
shouldn't print any messages about it.  Signal 1 is SIGHUP.  This
shouldn't occur.  On another run, I seem to recall getting messages about I/O
errors or unrestartable processes.  Anyway, the messages about signals
are associated with failing jobs in the build.
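
The signal numbers printed by make can be mapped back to names on
whatever machine is at hand. The following is a small Python sketch
(the helper name describe is hypothetical). Note that the numbering is
platform-specific: SIGTSTP is 18 on FreeBSD but has a different number
on some other systems, so the lookup should be run on the machine that
produced the messages.

```python
import signal


def describe(signum):
    """Return the symbolic name for a signal number on this platform."""
    try:
        return signal.Signals(signum).name
    except ValueError:
        return "unknown"


# Print the local numbers for the two signals seen in the transcript.
for name in ("SIGHUP", "SIGTSTP"):
    sig = getattr(signal, name)
    print(f"{name} = {int(sig)} ({signal.strsignal(sig)})")
```

Calling describe(18) on a FreeBSD host is the quick way to confirm what
"signal 18" in the output refers to.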

ref7-i386 now behaves normally -- ^Z and resume just work; no messages
are printed and the build completes successfully after resuming.

Bruce

