ZFS - NFS server for VMware ESXi issues

Rick Macklem rmacklem at uoguelph.ca
Sat Oct 22 03:06:15 UTC 2016

Marek Salwerowicz wrote:

Stuff snipped for brevity...

> Today, after two weeks of working, we experienced the same situation.
> The nfsd service was in the following state:
>   984 root        128  20    0 12344K  4020K vq->vq  8 346:27 0.00% nfsd
> The nfsd service didn't respond to "service nfsd restart", but this time
> the machine was able to reboot using the "# reboot" command.
I am not sure how "top" got a STATE of "vq->vq", but I suspect that refers to the
vdev section of the ZFS code. (The only other place in the kernel where "vq->vq"
shows up is in virtio and I doubt you are using that?)

I'm not a ZFS guy so I can't help, but I'd guess that it's looping around in the vdev
code, possibly competing for the vq->vq_lock?

Hopefully someone with ZFS expertise can help out?

Btw, about the only area of the NFS server that might need tuning is the DRC
(duplicate request cache), and nothing here suggests it is the problem. If you run
"nfsstat -e -s" on the server and see large numbers in the last line under
"Server Cache Stats:", there are tunables that can be used.
I'd also suggest you capture the output of "ps axHl" on the server when it happens
again, which tells you what all the nfsd threads are up to.
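
Since the hang may only show up again after weeks, it can help to have the two
commands above wrapped up ahead of time. What follows is only a rough sketch: the
function name capture_nfsd_state, the default directory /var/tmp/nfsd-debug and
the output file names are all made up for illustration. It just saves the output
of "ps axHl" and "nfsstat -e -s" into a timestamped directory.

```shell
#!/bin/sh
# Hypothetical helper: snapshot NFS server state into a timestamped directory.
# The function name, default path and file names are assumptions for illustration.

capture_nfsd_state() {
    # Base directory is the first argument, or /var/tmp/nfsd-debug if omitted.
    outdir="${1:-/var/tmp/nfsd-debug}/$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$outdir" || return 1

    # Per-thread listing in long format, so each nfsd thread is shown
    # separately. "|| true" keeps going even if the command fails.
    ps axHl > "$outdir/ps-axHl.txt" 2>&1 || true

    # Extended server-side NFS statistics, including "Server Cache Stats:".
    nfsstat -e -s > "$outdir/nfsstat-e-s.txt" 2>&1 || true

    # Print where the snapshot was written.
    echo "$outdir"
}
```

With the function loaded into a root shell, running "capture_nfsd_state" before
rebooting the hung server leaves both outputs on disk so they can be attached to
a follow-up mail.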

Good luck with it, rick

