NFSD lockup running ESXi 4

rhfb at akira.stdio.com rhfb at akira.stdio.com
Wed Jun 9 18:11:59 UTC 2010


I have an AMD64 FreeBSD 8.0 machine running 8-STABLE from around 2010/04/25 19:13:08.

Setup: ZFS storage, nfsd flags "-t -n 16", a private network used exclusively
for NFS, no jumbo frames, HZ=1000, DEVICE_POLLING, ZERO_COPY_SOCKETS, and the
following sysctl options:
net.inet.tcp.recvspace=232140
net.inet.tcp.sendspace=232140
net.inet.tcp.slowstart_flightsize=159
net.inet.tcp.mssdflt=1460
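
A side note on the tuning above: the socket buffer sizes appear to equal
slowstart_flightsize full-sized segments, i.e. 159 x 1460 bytes. A quick
sketch to check that arithmetic (the helper name flight_bytes is only for
illustration and is not anything from FreeBSD):

```python
# Values copied from the sysctl settings listed above.
recvspace = 232140            # net.inet.tcp.recvspace, bytes
sendspace = 232140            # net.inet.tcp.sendspace, bytes
slowstart_flightsize = 159    # net.inet.tcp.slowstart_flightsize, segments
mssdflt = 1460                # net.inet.tcp.mssdflt, bytes per segment


def flight_bytes(segments: int, mss: int) -> int:
    """Bytes covered by `segments` full-sized segments of `mss` bytes each."""
    return segments * mss


window = flight_bytes(slowstart_flightsize, mssdflt)

# Report whether the buffer sizes match the initial flight exactly.
print(window, window == recvspace == sendspace)
```

If any one of these values is changed, the others presumably need to be
recomputed to keep the same relationship.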

The FreeBSD box has a 6 TB zpool, exported over NFS to three ESXi 4 hosts
(newest patch level 193498). This has been working reliably for months.

I then added a new ESXi host, patched to the newest (post Update 1) patch
level, 256968. I created a bunch of VMs and booted them all from the Windows
Server 2008 R2 install DVD. When I attempted the installs in parallel, the NFS
server started locking up: nfsd would wedge at 100% CPU in "rc_lo", which I
presume is rc_lock. Once wedged, "/etc/rc.d/nfsd restart" can't kill nfsd, so
a reboot is required. A reboot causes all my active VMs with pending disk
writes to get disk errors inside the VM (the default timeout for disk writes
in the VM is 10 seconds). This was very reproducible.

Has anyone noticed this problem?  Is this an ESXi problem with the newest
updates?  Is this a problem with NFS on FreeBSD 8?


More information about the freebsd-hackers mailing list