Suddenly frozen fcntl/stat call on NFS over TCP with MTU 9000

John Baldwin jhb at freebsd.org
Mon Sep 15 20:49:43 UTC 2008


On Monday 15 September 2008 11:57:02 am Tim Chen wrote:
> I am currently running a mail server that uses a NetApp filer as backend
> storage. From time to time the whole system gets stuck for 3-5 minutes,
> after which everything recovers normally. While it is stuck, 'ps auxw'
> shows 200-300 mail delivery agent (MDA) processes in "D" state, and df
> does not respond either.

Can you use 'ps axl' to determine the wait message ("wchan") of the stuck
threads when they hang?  If it is "lockf", then make sure you have an
up-to-date RELENG_6 kernel, as there was a recent fix for a "lockf" hang.
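
With 200-300 stuck processes it can be easier to tally the wait channels than
to read them by eye. The following is only a rough sketch in Python, not part
of the original advice: it assumes the 'ps axl' header contains a column named
MWCHAN (or WCHAN) and a STAT column, and that every column before COMMAND is a
single whitespace-free token. The helper name count_wait_channels and the
sample listing are made up for illustration.

```python
from collections import Counter


def count_wait_channels(ps_output, states=("D",)):
    """Tally wait channels of processes whose STAT starts with one of `states`.

    Assumes `ps_output` is text from `ps axl` whose header has a MWCHAN or
    WCHAN column and a STAT column (an assumption about the ps format).
    """
    lines = [ln for ln in ps_output.splitlines() if ln.strip()]
    header = lines[0].split()
    wchan_col = next(i for i, name in enumerate(header)
                     if name in ("MWCHAN", "WCHAN"))
    stat_col = header.index("STAT")
    counts = Counter()
    for line in lines[1:]:
        # Split into at most len(header) fields so COMMAND keeps its spaces.
        fields = line.split(None, len(header) - 1)
        if len(fields) < len(header):
            continue  # malformed or truncated row; skip it
        if fields[stat_col].startswith(tuple(states)):
            counts[fields[wchan_col]] += 1
    return counts


# Hypothetical sample, invented for illustration; not real output.
SAMPLE = """\
UID   PID  PPID CPU PRI NI   VSZ   RSS MWCHAN STAT  TT       TIME COMMAND
  0   101     1   0   4  0  3000  1200 nfsreq D     ??    0:00.10 mda -d user1
  0   102     1   0   4  0  3000  1200 nfsreq D     ??    0:00.12 mda -d user2
  0   103     1   0   4  0  3000  1200 lockf  D     ??    0:00.05 mda -d user3
  0   104     1   0  96  0  5000  2000 select Ss    ??    0:01.00 sshd
"""

if __name__ == "__main__":
    for wchan, n in count_wait_channels(SAMPLE).most_common():
        print(wchan, n)
```

To use it on a live system, one could save the output of 'ps axl' to a file
during a hang and pass the file's contents to the function.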

Alternatively, if things are stuck in "nfsreq", it may be useful to use 
tcpdump to look at the NFS requests your client is making.  nfsstat can also 
be useful as you can see which counters are increasing during a hang.
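
One way to see which counters are increasing is to take two nfsstat snapshots
during a hang and compare them. The sketch below is a hedged illustration, not
a tested tool: it assumes the output contains rows of single-word counter names
followed directly by a row with the same number of integers, and it silently
skips any rows that do not fit that shape (for example multi-word headings).
The function names and sample snapshots are hypothetical.

```python
def parse_counters(text):
    """Pair each line of counter names with the line of integers below it.

    Assumption: names are single tokens and the values line has the same
    number of tokens; anything else is ignored.
    """
    counters = {}
    lines = text.splitlines()
    for names_line, values_line in zip(lines, lines[1:]):
        names = names_line.split()
        values = values_line.split()
        if (names and len(names) == len(values)
                and all(v.isdigit() for v in values)
                and not any(n.isdigit() for n in names)):
            counters.update(zip(names, map(int, values)))
    return counters


def increasing_counters(before, after):
    """Return {name: delta} for counters that grew between two snapshots."""
    return {name: after[name] - before[name]
            for name in after
            if name in before and after[name] > before[name]}


# Hypothetical snapshots, invented for illustration; not real nfsstat output.
BEFORE = """\
Rpc Counts:
  Getattr   Setattr    Lookup      Read     Write
     1000        20       500       300       200
"""
AFTER = """\
Rpc Counts:
  Getattr   Setattr    Lookup      Read     Write
     1450        20       500       300       200
"""

if __name__ == "__main__":
    delta = increasing_counters(parse_counters(BEFORE), parse_counters(AFTER))
    for name, grew_by in sorted(delta.items()):
        print(name, grew_by)
```

In this made-up example only Getattr grows, which would suggest looking at
getattr traffic in the tcpdump capture.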

-- 
John Baldwin


More information about the freebsd-stable mailing list