NFS deadlock on 9.2-Beta1

Michael Tratz michael at esosoft.com
Wed Jul 24 21:26:47 UTC 2013


Two machines (NFS server running ZFS, disk-less client) are both running FreeBSD r253506. The NFS client starts to deadlock processes within a few hours, and it usually gets worse from there. The processes stay in "D" state. I haven't been able to reproduce it on demand; I just have to wait a few hours until traffic to the client machine picks up and the deadlocks occur. The only way to clear the deadlocks is to reboot the client. Even running ls on a deadlocked path will deadlock ls itself. Which part of the file system gets deadlocked is totally random. The NFS server itself has no problem at all accessing the files/paths that are deadlocked on the client.

Last night I decided to put an older kernel, r252025 (June 20th), on the system. The NFS server stayed untouched. So far there have been 0 deadlocks on the client machine (it should have deadlocked by now). FreeBSD is working hard like it always does. :-) There are a few changes to the NFS code between the revision which seems to work and Beta1. I haven't tried to narrow down whether one of those commits is causing the problem. Maybe someone has an idea what could be wrong and I can test a patch, or can tell me if it's something else, because I'm not a kernel expert. :-)

I have run procstat -kk on several of the deadlocked processes, including the ls which deadlocked. You can see the output here:

http://pastebin.com/1RPnFT6r
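
For anyone who wants to collect the same kind of output, here is a rough sketch of one way to gather kernel stacks for every process in "D" state. The helper name d_state_pids is made up for illustration, and the ps column list is an assumption about what is convenient to look at; it is not the exact command used for the paste above. On FreeBSD, idle kernel threads also show up in "D" state, so expect some noise in the output.

```shell
#!/bin/sh
# Hypothetical helper: read "ps -ax -o pid,state,wchan,command" output on
# stdin and print the PID of every process whose state starts with "D".
d_state_pids() {
    # NR > 1 skips the header line; $2 is the state column.
    awk 'NR > 1 && $2 ~ /^D/ { print $1 }'
}

# Only attempt the dump where procstat is available (it ships with FreeBSD).
if command -v procstat >/dev/null 2>&1; then
    for pid in $(ps -ax -o pid,state,wchan,command | d_state_pids); do
        # -kk prints the kernel stack of each thread, with function offsets.
        procstat -kk "$pid"
    done
fi
```

Redirecting the loop's output to a file on a local (non-NFS) file system avoids writing to a path that may itself be deadlocked.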

I have tried to mount the file system with and without nolockd. It didn't make a difference. Other than that it is mounted with:

rw,nfsv3,tcp,noatime,rsize=32768,wsize=32768

Let me know if you need me to do something else or if some other output is required. I would have to go back to the problem kernel and wait until the deadlock occurs to get that information.

Thanks for your help,

Michael




More information about the freebsd-stable mailing list