panic in nfsd on 6.2-RC1

Sven Willenberger sven at dmv.com
Fri Dec 15 11:14:41 PST 2006


On Fri, 2006-12-15 at 13:15 -0500, Kris Kennaway wrote:
> On Fri, Dec 15, 2006 at 10:01:19AM -0500, Sven Willenberger wrote:
> > On Tue, 2006-12-05 at 12:38 +0900, Hiroki Sato wrote:
> > > Kostik Belousov <kostikbel at gmail.com> wrote
> > >   in <20061204160949.GM35681 at deviant.kiev.zoral.com.ua>:
> > > 
> > > ko> What version of sys/nfsserver/nfs_serv.c do you use ? If it is older than
> > > ko> 1.156.2.7, please, update the system.
> > > 
> > >  Thanks, I updated it just now and see how it works.
> > > 
> > > --
> > > | Hiroki SATO
> > 
> > I was/am having the same issue. Updating world (6.2-stable) to include
> > the above update sadly did not fix the problem for me. This is an amd64
> > box with only one client connecting to it via NFS. Reading further, it
> > may seem to be an issue with rpc.statd and/or rpc.lockd. As I only have
> > one client connecting and it is being used as mail storage (i.e. the
> > client pops/imaps the storage), would it be safe to not use fcntl
> > forwarding over the wire? Is this same issue present in 6.1-RELENG? I am
> > really at my wits' end at this point and for the first time am actually
> > considering moving to another OS (Solaris more than likely), as I cannot
> > have these types of issues interrupting services every couple of days.
> > 
> > What other information (specifically) can I provide to help the devs
> > figure out what is going on? What can I do in the meantime to have some
> > semblance of stability? I assume downgrading to 5.5-RELENG is out of the
> > question but perhaps disabling SMP?
> 
> Just to confirm, can you please post the panic backtrace you are
> seeing?  And can you explain what you mean by "may seem to be an issue
> with rpc.statd and/or rpc.lockd"?
> 
> Sometimes people think they're seeing the same problem as someone else
> when really it's a completely different problem in the same subsystem,
> so I'd like to rule that out here.
> 
> Kris

Well, I have now added kdb and invariants/witness support to the kernel,
so I should be able to get a backtrace the next time it happens.
Currently the system just locks up and no error is displayed on the
console or in /var/log/messages; sorry I cannot be of immediate help there.
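
For anyone wanting to do the same, kernel configuration lines along the
following lines would enable that kind of debugging support. This is only
a sketch using the stock option names from sys/conf/NOTES, not a copy of
the config actually in use here; DDB, INVARIANT_SUPPORT and
WITNESS_SKIPSPIN in particular are assumptions about what one would
normally add alongside the options named above.

```
# Kernel debugger framework plus the interactive DDB backend (assumed)
options 	KDB
options 	DDB

# Extra run-time consistency checks; INVARIANTS requires INVARIANT_SUPPORT
options 	INVARIANTS
options 	INVARIANT_SUPPORT

# Lock order verification; skipping spin mutexes reduces the overhead
options 	WITNESS
options 	WITNESS_SKIPSPIN
```

With WITNESS compiled in, lock order reversals are reported on the console
and in the system log, at a noticeable performance cost.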

Regarding the rpc issue, I just ran across mentions of those daemons in
sshfs/nfs threads appearing here, and in particular in a link referenced
within one of them
(http://docs.freebsd.org/cgi/getmsg.cgi?fetch=1362611+0+archive/2006/freebsd-stable/20060702.freebsd-stable).
It is more than likely not at all related, but I am grasping at straws
here trying to solve this.

FWIW, I do see the following appearing in /var/log/messages:
ufs_rename: fvp == tvp (can't happen)
about once or twice a day, but I cannot correlate those to the lockups.
Now that I have enabled the options mentioned above in the kernel, I am
also seeing some LOR (lock order reversal) issues:

kernel: lock order reversal:
kernel: 1st 0xffffff00c3bab200 kqueue (kqueue) @ /usr/src/sys/kern/kern_event.c:1547
kernel: 2nd 0xffffff0005bb6078 struct mount mtx (struct mount mtx) @ /usr/src/sys/ufs/ufs/ufs_vnops.c:138
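
In case it helps with sifting through a longer log, reports in the format
above could be pulled out with something like the Python sketch below. The
regular expression is inferred only from the two lock lines shown here, so
treat it as an assumption about the witness output format rather than a
complete parser; parse_lor and sample are names made up for illustration.

```python
import re

# Pattern inferred from the two lock lines above (an assumption, not a spec):
#   kernel: 1st 0x... <name> (<type>) @ <file>:<line>
LOR_LOCK = re.compile(
    r"kernel: (?P<order>\d+)(?:st|nd|rd|th) (?P<addr>0x[0-9a-f]+) "
    r"(?P<name>.+?) \((?P<type>.+?)\) @ (?P<file>\S+):(?P<line>\d+)"
)


def parse_lor(lines):
    """Group witness output into reversals, each a list of lock records."""
    reversals = []
    for text in lines:
        if "lock order reversal" in text:
            # A new report starts; the lock lines that follow belong to it.
            reversals.append([])
            continue
        match = LOR_LOCK.search(text)
        if match and reversals:
            record = match.groupdict()
            record["order"] = int(record["order"])
            record["line"] = int(record["line"])
            reversals[-1].append(record)
    return reversals


# The report quoted above, used as sample input.
sample = [
    "kernel: lock order reversal:",
    "kernel: 1st 0xffffff00c3bab200 kqueue (kqueue) @ "
    "/usr/src/sys/kern/kern_event.c:1547",
    "kernel: 2nd 0xffffff0005bb6078 struct mount mtx (struct mount mtx) @ "
    "/usr/src/sys/ufs/ufs/ufs_vnops.c:138",
]

if __name__ == "__main__":
    for reversal in parse_lor(sample):
        for lock in reversal:
            print(lock["order"], lock["name"], lock["file"], lock["line"])
```

Since the pattern is applied with search(), lines carrying the usual
syslog timestamp and hostname prefix should match as well.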