NFS 75 second stall

alan bryan alan.bryan at yahoo.com
Thu Jul 1 18:51:34 UTC 2010



--- On Thu, 7/1/10, Garrett Cooper <yanefbsd at gmail.com> wrote:

> From: Garrett Cooper <yanefbsd at gmail.com>
> Subject: Re: NFS 75 second stall
> To: "alan bryan" <alan.bryan at yahoo.com>
> Cc: freebsd-stable at freebsd.org
> Date: Thursday, July 1, 2010, 11:13 AM
> On Thu, Jul 1, 2010 at 11:01 AM, alan bryan <alan.bryan at yahoo.com> wrote:
> > Setup:
> >
> > server - FreeBSD 8-stable from today.  2 UFS dirs exported via NFS.
> > client - FreeBSD 8.0-Release.  Running a test php script that copies
> > around various files to/from 2 separate NFS mounts.
> >
> > Situation:
> >
> > script is started (forked to do 20 simultaneous runs) and 20 1GB
> > files are copied to the NFS dir which works fine.  When it then
> > switches to reading those files back and simultaneously writing to
> > the other NFS mount I see a hang of 75 seconds.  If I do an "ls -l"
> > on the NFS mount it hangs too.  After 75 seconds the client has
> > reported:
> >
> > nfs server 192.168.10.133:/usr/local/export1: not responding
> > nfs server 192.168.10.133:/usr/local/export1: is alive again
> > nfs server 192.168.10.133:/usr/local/export1: not responding
> > nfs server 192.168.10.133:/usr/local/export1: is alive again
> >
> > and then things start working again.  The server was originally
> > FreeBSD 8.0-Release also but was upgraded to the latest stable to
> > see if this issue could be avoided.
> >
> > # nfsstat -s -W -w 1
> >  GtAttr Lookup Rdlink   Read  Write Rename Access  Rddir
> >       0      0      0    222    257      0      0      0
> >       0      0      0    178    135      0      0      0
> >       0      0      0     85    127      0      0      0
> >       0      0      0      0      0      0      0      0
> >       0      0      0      0      0      0      0      0
> >       0      0      0      0      0      0      0      0
> >       0      0      0      0      0      0      0      0
> >       0      0      0      0      0      0      0      0
> >
> > ... for 75 rows of all zeros
> >
> >       0      0      0    272    266      0      0      0
> >       0      0      0    167    165      0      0      0
> >
> > I also tried runs with 15 simultaneous processes and 25.  15
> > processes gave only about a 5 second stall but 25 gave again the
> > same 75 second stall.
> >
> > Further, I tested with 2 mounts to the same server but from ZFS
> > filesystems with the exact same stall/timeout periods.  So, it
> > doesn't appear to matter what the underlying filesystem is - it's
> > something in NFS or networking code.
> >
> > Any ideas on what's going on here?  What's causing the complete
> > stall period of zero NFS activity?  Any flaws with my testing
> > methods?
> >
> > Thanks for any and all help/ideas.
> 
> What network driver are you using? Have you tried
> tcpdumping the packets?
> -Garrett
> 

I'm currently using igb but have also used em.  I have not yet tried running tcpdump on this test.  Any suggestions on what to look out for?  (I'm not that familiar with that whole process.)

Which brings up another point - I'm using TCP connections for NFS, not UDP.  
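
For anyone wanting to measure the stall rather than count rows by eye, here is a rough Python sketch that scans raw "nfsstat -s -W -w 1" output (as in the quoted table above) and reports runs of all-zero rows.  It is only an illustration: the function name find_stalls and the min_rows threshold are made up here, and it assumes each data row is a line of whitespace-separated integers, one row per second.

```python
def find_stalls(lines, min_rows=2):
    """Return (start_row, length) for each run of all-zero sample rows.

    Assumes `lines` is raw `nfsstat -s -W -w 1` output: header lines
    contain non-numeric column names, data rows contain only integers.
    `min_rows` (hypothetical threshold) filters out brief idle moments.
    """
    stalls = []
    run_start = None   # index of the first row in the current zero run
    run_len = 0        # number of consecutive all-zero rows seen so far
    row = 0            # index of the current data row (headers excluded)
    for line in lines:
        fields = line.split()
        # Skip blank lines and the periodically repeated column headers.
        if not fields or not all(f.isdigit() for f in fields):
            continue
        if all(int(f) == 0 for f in fields):
            if run_len == 0:
                run_start = row
            run_len += 1
        else:
            if run_len >= min_rows:
                stalls.append((run_start, run_len))
            run_len = 0
        row += 1
    # Report a stall that is still in progress when the input ends.
    if run_len >= min_rows:
        stalls.append((run_start, run_len))
    return stalls
```

Passing sys.stdin to find_stalls() while piping nfsstat into the script should work; with a 1 second interval the run length approximates the stall duration in seconds.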

--Alan



      

