kern/79700: suspending nfs file access hangs other access to the file

Dylan Simon dylan at dylex.net
Fri Apr 8 15:10:36 PDT 2005


>Number:         79700
>Category:       kern
>Synopsis:       suspending nfs file access hangs other access to the file
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Apr 08 22:10:35 GMT 2005
>Closed-Date:
>Last-Modified:
>Originator:     Dylan Simon
>Release:        FreeBSD 5.3-RELEASE-p5 i386
>Organization:
Rainfinity
>Environment:
System: FreeBSD druid.rainfinity.prv 5.3-RELEASE-p5 FreeBSD 5.3-RELEASE-p5 #1: Tue Mar 8 14:36:00 PST 2005 dylan at druid.rainfinity.prv:/usr/obj/usr/src/sys/QUARK i386      
>Description:
Suspending a process that is actively writing to a file on an nfs mount causes
future access to that file by other processes to hang until the first process
is resumed.

A tcpdump shows all outstanding WRITE operations have successfully completed,
and the ACCESS to the file is not sent until the process is resumed (although
access to other files on that mount continues without interruption).

The hung process is in state D.  A truss of the hung process shows it hanging
before the stat() of the file.  For example, if we suspend a cp to file3, a
truss of ls -l on the directory shows:

open(".",0x4,05001257435)                        = 6 (0x6)
fstat(6,0xbfbfde60)                              = 0 (0x0)
fcntl(6,F_SETFD,0x1)                             = 0 (0x0)
break(0x8054000)                                 = 0 (0x0)
__sysctl(0xbfbfdc18,0x2,0x281b3f1c,0xbfbfdc14,0x0,0x0) = 0 (0x0)
fstatfs(0x6,0xbfbfdc80)                          = 0 (0x0)
break(0x8055000)                                 = 0 (0x0)
fstat(6,0xbfbfde60)                              = 0 (0x0)
fchdir(0x6)                                      = 0 (0x0)
getdirentries(0x6,0x8054000,0x1000,0x8053014)    = 512 (0x200)
lstat("file0",0x8052248)                         = 0 (0x0)
lstat("file1",0x8052348)                         = 0 (0x0)
lstat("file2",0x8052448)                         = 0 (0x0)
<hang here until resume>
lstat("file3",0x8052548)                         = 0 (0x0)
getdirentries(0x6,0x8054000,0x1000,0x8053014)    = 0 (0x0)
lseek(6,0x0,SEEK_SET)                            = 0 (0x0)
close(6)                                         = 0 (0x0)

This has been reproduced on 5.2.1 as well as 5.3.  It happens about half the
time.
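
Where truss is not convenient, the per-entry lstat() calls that ls -l makes
can be timed one at a time to see which file stalls.  The following is only a
minimal Python sketch and not part of the original test setup; time_lstat is
a hypothetical helper name.  A call that blocks is reported only once it
returns, which per the behaviour above would be after the copy is resumed.

```python
import os
import time


def time_lstat(directory):
    """Return [(name, seconds)] for one lstat() per directory entry.

    Entries are visited in sorted order, roughly mirroring the sequence of
    lstat() calls seen in the truss output of ls -l.
    """
    results = []
    for name in sorted(os.listdir(directory)):
        start = time.monotonic()
        os.lstat(os.path.join(directory, name))  # blocks here if the file is hung
        results.append((name, time.monotonic() - start))
    return results
```

Calling time_lstat("/mnt/path") from a second shell while the copy is
suspended should show an unusually long time for the file being written,
assuming the hang occurs on that attempt.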

>How-To-Repeat:
> mount -t nfs -o intr server:/export /mnt/path

Mount can be soft or hard.  Only tried udp.  Happened on fast (100Mb) and
slow (T1) connections to server.

> cd /mnt/path
> cp big_file file2
> ^Z

Suspend the copy in the middle.

Now, on any other shell:

> cd /mnt/path
> ls -l

This hangs until resuming the suspended copy.

This will work about half the time.  The other half, the ls -l will complete
fine.  Resuming the copy and resuspending it often causes the problem again.
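
Since the hang is intermittent, it may help to script the steps so they can
be repeated many times.  The following is a rough Python sketch under stated
assumptions, not a verified reproducer: probe_while_suspended is a
hypothetical name, SIGSTOP is used in place of the SIGTSTP that ^Z sends
(whether that triggers the same behaviour is untested), and the delay and
timeout values are arbitrary.

```python
import os
import signal
import subprocess
import time


def probe_while_suspended(src, dst, delay=2.0, timeout=10.0):
    """Suspend a running cp, then check whether `ls -l` of dst's directory finishes.

    Returns True if ls exited within `timeout` seconds and False if it was
    still running at that point.  The copy is resumed in either case.
    """
    directory = os.path.dirname(os.path.abspath(dst))
    copy = subprocess.Popen(["cp", src, dst])
    time.sleep(delay)                     # let the copy get under way
    copy.send_signal(signal.SIGSTOP)      # assumption: stands in for ^Z (SIGTSTP)
    ls = subprocess.Popen(["ls", "-l", directory], stdout=subprocess.DEVNULL)
    try:
        deadline = time.monotonic() + timeout
        # Poll rather than wait(), so a blocked ls cannot block this script.
        while ls.poll() is None and time.monotonic() < deadline:
            time.sleep(0.1)
        finished = ls.poll() is not None
    finally:
        copy.send_signal(signal.SIGCONT)  # resume the copy
        copy.wait()
        ls.wait()                         # relies on ls unblocking after the resume
    return finished
```

For example, probe_while_suspended("big_file", "/mnt/path/file2") could be
called in a loop; a False result would correspond to the hang described
above.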
>Fix:
      
>Release-Note:
>Audit-Trail:
>Unformatted:

