NFS 75 second stall
Garrett Cooper
yanefbsd at gmail.com
Thu Jul 1 19:23:59 UTC 2010
On Thu, Jul 1, 2010 at 11:51 AM, alan bryan <alan.bryan at yahoo.com> wrote:
>
>
> --- On Thu, 7/1/10, Garrett Cooper <yanefbsd at gmail.com> wrote:
>
>> From: Garrett Cooper <yanefbsd at gmail.com>
>> Subject: Re: NFS 75 second stall
>> To: "alan bryan" <alan.bryan at yahoo.com>
>> Cc: freebsd-stable at freebsd.org
>> Date: Thursday, July 1, 2010, 11:13 AM
>> On Thu, Jul 1, 2010 at 11:01 AM, alan bryan <alan.bryan at yahoo.com> wrote:
>> > Setup:
>> >
>> > server - FreeBSD 8-stable from today. 2 UFS dirs exported via NFS.
>> > client - FreeBSD 8.0-Release. Running a test php script that copies around various files to/from 2 separate NFS mounts.
>> >
>> > Situation:
>> >
>> > script is started (forked to do 20 simultaneous runs) and 20 1GB files are copied to the NFS dir which works fine. When it then switches to reading those files back and simultaneously writing to the other NFS mount I see a hang of 75 seconds. If I do an "ls -l" on the NFS mount it hangs too. After 75 seconds the client has reported:
>> >
>> > nfs server 192.168.10.133:/usr/local/export1: not responding
>> > nfs server 192.168.10.133:/usr/local/export1: is alive again
>> > nfs server 192.168.10.133:/usr/local/export1: not responding
>> > nfs server 192.168.10.133:/usr/local/export1: is alive again
>> >
>> > and then things start working again. The server was originally FreeBSD 8.0-Release also but was upgraded to the latest stable to see if this issue could be avoided.
>> >
>> > # nfsstat -s -W -w 1
>> >  GtAttr Lookup Rdlink   Read  Write Rename Access  Rddir
>> >       0      0      0    222    257      0      0      0
>> >       0      0      0    178    135      0      0      0
>> >       0      0      0     85    127      0      0      0
>> >       0      0      0      0      0      0      0      0
>> >       0      0      0      0      0      0      0      0
>> >       0      0      0      0      0      0      0      0
>> >       0      0      0      0      0      0      0      0
>> >       0      0      0      0      0      0      0      0
>> >
>> > ... for 75 rows of all zeros
>> >
>> >       0      0      0    272    266      0      0      0
>> >       0      0      0    167    165      0      0      0
>> >
>> > I also tried runs with 15 simultaneous processes and 25. 15 processes gave only about a 5 second stall but 25 gave again the same 75 second stall.
>> >
>> > Further, I tested with 2 mounts to the same server but from ZFS filesystems with the exact same stall/timeout periods. So, it doesn't appear to matter what the underlying filesystem is - it's something in NFS or networking code.
>> >
>> > Any ideas on what's going on here? What's causing the complete stall period of zero NFS activity? Any flaws with my testing methods?
>> >
>> > Thanks for any and all help/ideas.
>>
>> What network driver are you using? Have you tried tcpdumping the packets?
>> -Garrett
>>
>
> I'm using igb currently but have also used em. I have not tried tcpdumping the packets yet on this test. Any suggestions on things to look out for? (I'm not that familiar with that whole process.)
>
> Which brings up another point - I'm using TCP connections for NFS, not UDP.
Is the net.inet.tcp.tso sysctl enabled or not? What about rxcsum and txcsum?
Thanks,
-Garrett
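
For anyone following along, here is a rough sketch of one way to answer the TSO and checksum-offload questions above on a FreeBSD box. It assumes the NFS traffic goes over an interface named igb0 (a guess based on the driver mentioned earlier; set IFACE to the real name), and has_cap is just a hypothetical helper that looks for a capability name in the "options=" line printed by ifconfig.

```shell
#!/bin/sh
# Assumed interface name; override with e.g. IFACE=em0 if the NFS traffic
# uses a different NIC.
IFACE="${IFACE:-igb0}"

# has_cap (hypothetical helper): succeeds if the capability named in $1
# appears as a whole word on the "options=" line of the ifconfig text in $2.
has_cap() {
    printf '%s\n' "$2" | grep 'options=' | grep -qw "$1"
}

# System-wide TSO switch. The OID only exists on FreeBSD, so fall back to a
# message anywhere else.
if command -v sysctl >/dev/null 2>&1; then
    sysctl net.inet.tcp.tso 2>/dev/null || echo "net.inet.tcp.tso not available"
fi

# Per-interface offload capabilities, skipped if the interface is not present.
if command -v ifconfig >/dev/null 2>&1 && cfg=$(ifconfig "$IFACE" 2>/dev/null); then
    for cap in RXCSUM TXCSUM TSO4; do
        if has_cap "$cap" "$cfg"; then
            echo "$IFACE: $cap enabled"
        else
            echo "$IFACE: $cap not enabled"
        fi
    done
fi
```

To rule offloading in or out, one could temporarily disable it with "ifconfig igb0 -tso -rxcsum -txcsum" (or "sysctl net.inet.tcp.tso=0") on both client and server and rerun the 20-process test. Whether that changes the 75 second stall is not something this thread has established yet.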