NFSD lockup running ESXi 4

Brian Somers brian at FreeBSD.org
Wed Jun 9 21:56:44 UTC 2010


On Wed,  9 Jun 2010 13:52:40 -0400 (EDT) rhfb at akira.stdio.com wrote:
> I have an AMD64 FreeBSD 8.0 running 8-Stable from around 2010/04/25 19:13:08.
> 
> ZFS disk, Nfsd flags "-t -n 16", private network exclusive for nfs network,
> not using jumbo frames, HZ=1000, Device_Polling, Zero_Copy_Sockets, and the
> following sysctl options:
> net.inet.tcp.recvspace=232140
> net.inet.tcp.sendspace=232140
> net.inet.tcp.slowstart_flightsize=159
> net.inet.tcp.mssdflt=1460
> 
> FreeBSD 6 TB zpool, nfs from Three ESXi 4 (newest patch level 193498)
> working reliably for months.
> 
> Added a new ESXi, patched to the newest (Post Update 1) patch level 256968.
> Added a bunch of VM's, booted them all into the 2008 R2 Server install DVD.
> Then when attempting to do the installs (in parallel/simultaneously) I started
> getting the NFS server locking up.  NFSD would wedge at 100% CPU in "rc_lo"
> which I presume is rc_lock?  Once wedged, /etc/rc.d/nfsd restart can't kill
> nfsd, so a reboot is required.  A reboot causes all my active VMs with
> pending disk writes to have disk errors in the VM (10 second default timeout
> for disk writes in the VM).  This was very reproducible.
> 
> Has anyone noticed this problem?  Is this an ESXi problem with the newest
> updates?  Is this a problem with NFS on FreeBSD 8?

I don't know if it's relevant, but I've been having NFS issues on -current.
I believe they were caused by gam_server, a GNOME program running on an
NFS client machine that had /usr/ports NFS-mounted and was doing a portupgrade.
Nothing GNOME-related should have been anywhere near /usr/ports, but analysis
showed huge numbers of stat requests over NFS against /usr/ports/distfiles/*,
re-stat'ing the same files over and over.  nfsd was going crazy on the server
and gam_server was clocking up wads of CPU time on the client.
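
As a rough illustration of that kind of analysis (not the tooling actually
used here), the sketch below tallies how often each path is the target of a
GETATTR request. The one-"OP PATH"-per-line trace format, the function names
and the sample data are all assumptions made up for the example; output from
a real capture tool would need its own parsing.

```python
from collections import Counter


def count_getattr_targets(trace_lines):
    """Count GETATTR requests per path.

    Assumes a hypothetical trace format of one 'OP PATH' pair per line.
    """
    counts = Counter()
    for line in trace_lines:
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        op, path = parts
        if op.upper() == "GETATTR":
            counts[path.strip()] += 1
    return counts


def hot_paths(counts, threshold):
    """Return (path, count) pairs seen at least `threshold` times, busiest first."""
    return [(p, c) for p, c in counts.most_common() if c >= threshold]


# Made-up sample data, for illustration only.
sample = [
    "GETATTR /usr/ports/distfiles/a.tar.gz",
    "GETATTR /usr/ports/distfiles/a.tar.gz",
    "READ /usr/ports/distfiles/a.tar.gz",
    "GETATTR /usr/ports/distfiles/b.tar.gz",
    "GETATTR /usr/ports/distfiles/a.tar.gz",
]

if __name__ == "__main__":
    for path, count in hot_paths(count_getattr_targets(sample), threshold=2):
        print(count, path)
```

A path that keeps showing up with a high count while nothing on it is being
read or written is the sort of pattern described above.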

FreeBSD-9 kernels prior to around June 6 were freezing on me.  It may have
been because of the nfsd activity, but I didn't investigate the freeze...

Perhaps looking through the changes from the week prior to June 6 for anything
that might affect nfsd stability would turn up a fix?

-- 
Brian Somers                                          <brian at Awfulhak.org>
Don't _EVER_ lose your sense of humour !               <brian at FreeBSD.org>


More information about the freebsd-hackers mailing list