amd64/161493: NFS v3 directory structure update slow

George Breahna george at polarismail.com
Thu Oct 13 02:10:12 UTC 2011


The following reply was made to PR kern/161493; it has been noted by GNATS.

From: "George Breahna" <george at polarismail.com>
To: "'John Baldwin'" <jhb at freebsd.org>,
	<freebsd-amd64 at freebsd.org>
Cc: <freebsd-gnats-submit at freebsd.org>,
	"'Rick Macklem'" <rmacklem at freebsd.org>
Subject: RE: amd64/161493: NFS v3 directory structure update slow
Date: Wed, 12 Oct 2011 21:41:34 -0400

 I can also confirm that using the -o option with nfsd makes the problem go away.
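
For anyone repeating the test after restarting nfsd with -o on the server, the How-To-Repeat steps from the PR can be wrapped in a small script and run on a client. This is only a rough sketch: the function name check_dotdot and the directory argument are made up for illustration, and it assumes that a failing `ls` on '..' is a good enough signal for the Input/output error shown in the report.

```shell
#!/bin/sh
# Hypothetical helper, not part of the PR: repeats the How-To-Repeat
# steps under a given directory and reports whether '..' is visible.
check_dotdot() {
    base="$1"
    dir="$base/test1.$$"
    mkdir "$dir" || return 2
    touch "$dir/file1" "$dir/file2"
    # Run from inside the new directory, as in the report. If either
    # listing fails (e.g. "ls: ..: Input/output error"), '..' is
    # treated as missing.
    if (cd "$dir" && ls -la >/dev/null 2>&1 && ls -d .. >/dev/null 2>&1); then
        echo "ok"
        rc=0
    else
        echo "dotdot-missing"
        rc=1
    fi
    # Clean up; this may itself fail while the problem is present.
    rm -rf "$dir" 2>/dev/null
    return $rc
}
```

Calling it as `check_dotdot /path/to/nfs/mount` (the path is a placeholder) before and after switching nfsd to -o gives a quick comparison without reading the ls output by hand.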
 
 George
 
 -----Original Message-----
 From: John Baldwin [mailto:jhb at freebsd.org] 
 Sent: Wednesday, October 12, 2011 9:30 AM
 To: freebsd-amd64 at freebsd.org
 Cc: George Breahna; freebsd-gnats-submit at freebsd.org; Rick Macklem
 Subject: Re: amd64/161493: NFS v3 directory structure update slow
 
 On Tuesday, October 11, 2011 11:07:13 am George Breahna wrote:
 > 
 > >Number:         161493
 > >Category:       amd64
 > >Synopsis:       NFS v3 directory structure update slow
 > >Confidential:   no
 > >Severity:       critical
 > >Priority:       high
 > >Responsible:    freebsd-amd64
 > >State:          open
 > >Quarter:        
 > >Keywords:       
 > >Date-Required:
 > >Class:          sw-bug
 > >Submitter-Id:   current-users
 > >Arrival-Date:   Tue Oct 11 15:10:07 UTC 2011
 > >Closed-Date:
 > >Last-Modified:
 > >Originator:     George Breahna
 > >Release:        9.0 Beta 2
 > >Organization:
 > >Environment:
 > FreeBSD store2 9.0-BETA2 FreeBSD 9.0-BETA2 #0: Sun Sep 18 22:02:45 EDT 2011     pulsar at store2.emailarray.com:/usr/obj/usr/src/sys/PULSAR  amd64
 > >Description:
 > We used to run an NFS server on FreeBSD 6.2, but we recently built a new
 > box and installed 9.0 Beta 2 on it. The data was moved over, as it serves
 > as the back-end for a mail system. It runs NFS v3 over TCP only, and all
 > the NFS-related processes (rpcbind, mountd, lockd, etc.) run with the -h
 > switch and bind to the local IP address.
 > 
 > The NFS server exports the data to 7 NFS clients ranging from FreeBSD 6.1
 > to 8.2, the majority being 8.2. The mount on the NFS clients is done
 > simply with -o tcp,rsize=32768,wsize=32768
 > 
 > Usual file operations, such as accessing files, creating directories,
 > removing files, chmod, chown, etc. work perfectly, but we noticed issues
 > when removing directories that contained data. We got a strange error:
 > 
 > rm -rf nick/
 > rm: fts_read: Input/output error
 > 
 > Using 'truss' on rm revealed this:
 > 
 > open("..",O_RDONLY,00)                           ERR#5 'Input/output error'
 > 
 > After much testing and debugging we realized the problem is in the NFS
 > protocol (either server or client, but we assume server since this used
 > to work very well with FreeBSD 6.2). The problem appears to be that NFS
 > does not show the '..' after modifying a directory structure. Take the
 > following example executed on a FreeBSD 8.2 client accessing the NFS
 > share from the 9.0B2 server:
 > 
 > imap5# mkdir test1
 > imap5# cd test1
 > imap5# touch file1
 > imap5# touch file2
 > imap5# ls -la
 > ls: ..: Input/output error
 > total 4
 > drwxr-xr-x  2 root  vchkpw  512 Oct 11 10:55 .
 > -rw-r--r--  1 root  vchkpw    0 Oct 11 10:55 file1
 > -rw-r--r--  1 root  vchkpw    0 Oct 11 10:55 file2
 > 
 > Notice the '..' is missing from the display. If we now try to remove the
 > directory 'test1', it will throw the "rm: fts_read: Input/output error"
 > error.
 > 
 > If we wait between 1 and 5 minutes, '..' will eventually appear by
 > itself. During this whole time, '..' effectively exists on the NFS server
 > but it's not displayed by any of the NFS clients.
 > 
 > I can force the NFS client to show it faster by doing an ls -la from the
 > parent level. For example:
 > 
 > imap5# mkdir test1
 > imap5# touch test1/file1
 > imap5# touch test1/file2
 > imap5# touch test1/file3
 > imap5# ls -la test1
 > total 8
 > drwxr-xr-x   2 root      vchkpw   512 Oct 11 10:59 .
 > drwx------  10 vpopmail  vchkpw  1024 Oct 11 10:59 ..
 > -rw-r--r--   1 root      vchkpw     0 Oct 11 10:59 file1
 > -rw-r--r--   1 root      vchkpw     0 Oct 11 10:59 file2
 > -rw-r--r--   1 root      vchkpw     0 Oct 11 10:59 file3
 > imap5# cd test1
 > imap5# ls -la
 > total 8
 > drwxr-xr-x   2 root      vchkpw   512 Oct 11 10:59 .
 > drwx------  10 vpopmail  vchkpw  1024 Oct 11 10:59 ..
 > -rw-r--r--   1 root      vchkpw     0 Oct 11 10:59 file1
 > -rw-r--r--   1 root      vchkpw     0 Oct 11 10:59 file2
 > -rw-r--r--   1 root      vchkpw     0 Oct 11 10:59 file3
 > 
 > but if we wait 5 seconds after that display and try again:
 > 
 > ls -la
 > ls: ..: Input/output error
 > total 4
 > drwxr-xr-x  2 root  vchkpw  512 Oct 11 10:59 .
 > -rw-r--r--  1 root  vchkpw    0 Oct 11 10:59 file1
 > -rw-r--r--  1 root  vchkpw    0 Oct 11 10:59 file2
 > -rw-r--r--  1 root  vchkpw    0 Oct 11 10:59 file3
 > 
 > Again, if we wait longer (1-5 minutes), the '..' will properly appear in
 > there.
 > 
 > There are no error messages on the console or in other log files. This is
 > reproducible 100% of the time with any FreeBSD client. We have tried
 > unmounting/remounting several times without any effect. We also tried
 > different rsize/wsize values, with no effect. I think there is some delay
 > in updating the directory structure and it's causing this bug.
 > 
 > Here's also some output from nfsstat on the server:
 > 
 > 
 > Server Info:
 >   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
 > 114731225  20496896 254966151       133  11697392  19963641         0   9228861
 >    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
 >   4313471   1157651        39      1955  16511932  15479669         0 116927742
 >     Mknod    Fsstat    Fsinfo  PathConf    Commit
 >         0   4748487        48         0  14921747
 > Server Ret-Failed
 >                 0
 > Server Faults
 >             0
 > Server Cache Stats:
 >    Inprog      Idem  Non-idem    Misses
 >         0         0         0 613368147
 > Server Write Gathering:
 >  WriteOps  WriteRPC   Opsaved
 >  19963641  19963641         0
 > 
 > >How-To-Repeat:
 > imap5# mkdir test1
 > imap5# cd test1
 > imap5# touch file1
 > imap5# touch file2
 > imap5# ls -la
 > ls: ..: Input/output error
 > total 4
 > drwxr-xr-x  2 root  vchkpw  512 Oct 11 10:55 .
 > -rw-r--r--  1 root  vchkpw    0 Oct 11 10:55 file1
 > -rw-r--r--  1 root  vchkpw    0 Oct 11 10:55 file2
 > >Fix:
 
 Can you try using the "old" NFS server as a test?
 
 -- 
 John Baldwin
 

