Stale NFS file handles on 8.x amd64

Adam McDougall mcdouga9 at egr.msu.edu
Tue Nov 30 01:24:17 UTC 2010


I've been running dovecot 1.1 on FreeBSD 7.x for a while with a bare 
minimum of NFS problems, but it got worse with 8.x.  I have 2-4 servers 
(usually just 2) accessing mail on a NetApp over NFSv3 via imapd.  
Delivery is via procmail, which doesn't touch the dovecot metadata, and 
webmail uses imapd.  Client connections to imapd go to random servers 
and I don't yet have solid means to keep certain users on certain 
servers.  I upgraded some of the servers to 8.x and dovecot 1.2 and ran 
into "Stale NFS file handle" errors that corrupt the index/uidlist files, 
which makes inboxes appear empty when they are not.  In some cases the 
corrupt index had to be deleted manually.  I first suspected dovecot 1.2 
since it was upgraded at the same time, but I downgraded to 1.1 and it's 
doing the same thing.  I don't really have a wealth of details to go on 
yet and I usually stay quiet until I do, and half the time it is 
difficult to reproduce myself so I've had to put it in production to get 
a feel for progress.  This only happens a dozen or so times per weekday 
but I feel the need to start taking bigger steps.  I'll probably do what 
I can to get IMAP back on a stable base (7.x?) and also try to debug 8.x 
on the remaining servers.  A binary search is within possibility if I 
can reproduce the symptoms often enough even if I have to put a test 
server in production for a few hours.

Any tips on where we could start looking, or alterations I could try 
making such as sysctls to return to older behavior?  It might be worth 
noting that I've seen a considerable increase in traffic from my mail 
servers since the 8.x upgrade timeframe, on the order of 5-10x as much 
traffic to the NFS server.  dovecot tries its hardest to flush out the 
access cache when needed, and that has worked well enough since about 
1.0.16 (years ago).  It seems like FreeBSD is what regressed in this 
scenario.  dovecot 2.x is going in a different direction from my 
situation and I'm not ready to start testing that immediately if I can 
avoid it as it will involve some restructuring.
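
For anyone unfamiliar with the error: ESTALE means the client still holds 
a handle to a file that has since been replaced or removed on the server, 
typically by another client.  The usual application-side workaround is to 
reopen the path and try again.  The Python below is only a minimal sketch 
of that idea, not dovecot's actual code; the function name, the retry 
count, and the injectable opener are all hypothetical.

```python
import errno


def read_with_estale_retry(path, retries=3, opener=open):
    """Read a whole file, reopening the path if the handle goes stale.

    Hypothetical helper for illustration only. `opener` defaults to the
    built-in open() and exists so the behaviour can be exercised without
    a real NFS mount.
    """
    last_error = None
    for _ in range(max(1, retries)):
        try:
            # Reopening by path asks the client for a fresh file handle.
            with opener(path, "rb") as f:
                return f.read()
        except OSError as e:
            if e.errno != errno.ESTALE:
                raise  # anything other than a stale handle is not retried
            last_error = e
    # Every attempt hit ESTALE; surface the last error to the caller.
    raise last_error
```

A retry like this only hides the symptom on reads.  It does not explain 
why the handles go stale more often on 8.x than on 7.x.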

Thanks for any input.  For now the following errors are about all I have 
to go on:

Nov 29 11:07:54 server1 dovecot: IMAP(user1): o_stream_send(/home/user1/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
Nov 29 13:19:51 server1 dovecot: IMAP(user1): o_stream_send(/home/user1/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
Nov 29 14:35:41 server1 dovecot: IMAP(user2): o_stream_send(/home/user2/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
Nov 29 15:07:05 server1 dovecot: IMAP(user3): read(mail, uid=128990) failed: Stale NFS file handle

Nov 29 11:57:22 server2 dovecot: IMAP(user4): open(/egr/mail/shared/vprgs/dovecot-acl-list) failed: Stale NFS file handle
Nov 29 14:04:22 server2 dovecot: IMAP(user5): o_stream_send(/home/user5/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
Nov 29 14:27:21 server2 dovecot: IMAP(user6): o_stream_send(/home/user6/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
Nov 29 15:44:38 server2 dovecot: IMAP(user7): open(/egr/mail/shared/decs/dovecot-acl-list) failed: Stale NFS file handle
Nov 29 19:04:54 server2 dovecot: IMAP(user8): o_stream_send(/home/user8/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle

Nov 29 06:32:11 server3 dovecot: IMAP(user9): open(/egr/mail/shared/cmsc/dovecot-acl-list) failed: Stale NFS file handle
Nov 29 10:03:58 server3 dovecot: IMAP(user10): o_stream_send(/home/user10/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
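
Since these errors only show up a dozen or so times per weekday, a count 
per server and per failing call may help when comparing a 7.x box against 
an 8.x one.  The following Python is a rough sketch that assumes the 
syslog format shown above; the function names are made up for 
illustration, and it accepts entries whether or not they were wrapped 
across lines by the mail client.

```python
import re
from collections import Counter

# A new syslog entry starts with a timestamp such as "Nov 29 11:07:54 ".
ENTRY_START = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d\d:\d\d:\d\d ")

# Assumed layout: timestamp, host, "dovecot: IMAP(user): call(...) failed: ..."
ENTRY = re.compile(
    r"^\w{3} +\d+ [\d:]+ (?P<host>\S+) dovecot: IMAP\((?P<user>[^)]+)\): "
    r"(?P<call>\w+)\(.*\) failed: Stale NFS file handle$"
)


def join_wrapped(lines):
    """Glue continuation lines back onto the entry they belong to."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def tally_stale(lines):
    """Count stale-handle errors per (host, failing call)."""
    counts = Counter()
    for entry in join_wrapped(lines):
        m = ENTRY.match(entry)
        if m:
            counts[(m.group("host"), m.group("call"))] += 1
    return counts
```

Feeding it the lines of a log file, for example 
tally_stale(open(path)) with path pointing at wherever dovecot logs on a 
given server, gives one counter per host that can be compared day to day.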


More information about the freebsd-stable mailing list