NFS == lock && reboot

Oliver Fromme olli at lurza.secnetix.de
Wed Apr 4 14:27:35 UTC 2007


Chris H. <chris#@1command.com> wrote:
 > Thomas David Rivers wrote:
 > > I have found that if I kill rpc.lockd on the NFS server,
 > > most of the NFS issues I have (including a similar lock-up on
 > > 6.1-RELEASE) go away.

FWIW, I also had problems with running rpc.lockd and
rpc.statd (no panics, though).  If you don't need them
(i.e. you don't need cross-machine locking), then don't
use them.  Use the -L flag to mount_nfs so at least
local locking works.

 > You don't happen to have any experiences keeping rpc.statd
 > running?

Basically, it doesn't make much sense to run one without
the other.  If you disable rpc.lockd, you can also safely
disable rpc.statd.
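
As a rough sketch of the advice above (no rpc.lockd or rpc.statd, and
a mount with -L), the steps might look like the following.  The names
"server", "/export" and "/mnt/nfs" are placeholders and not taken from
this thread.  The script only prints the commands unless DRY_RUN=0 is
set, because the real commands need root on a FreeBSD machine.

```shell
#!/bin/sh
# Sketch only: prints each command unless DRY_RUN=0 is set.
# "server", "/export" and "/mnt/nfs" are hypothetical placeholders.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop the lock and status daemons if they are running.
run killall rpc.lockd
run killall rpc.statd

# Mount with -L so that locking is handled locally on the client
# instead of being forwarded to the server.
run mount_nfs -L server:/export /mnt/nfs

# To keep the daemons off across reboots, /etc/rc.conf would contain:
#   rpc_lockd_enable="NO"
#   rpc_statd_enable="NO"
```

Keep in mind that with -L, locks taken on one client are not visible
to other clients or to the server.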

However, I don't think that your actual problem (lock-up
and panics) is related to rpc.lockd or rpc.statd.  It
rather sounds like something else is wrong with your
machine.  NFS works perfectly fine for me, including
copying huge files.

You wrote that you had a lot of crashes that accumulated
many files in lost+found.  Well, maybe your filesystem
was somehow damaged in the process.  It is possible to
damage file systems in a way that can lead to panics, and
it's not necessarily detected and repaired by fsck.

 > > > # cp /path/to/approx/10Mb/file /host/path/to/dest/dir/
 > > > 
 > > > Fatal double fault
 > > > eis 0x0blah
 > > > eiblah blah0x
 > > > panic double fault
 > > > no dump device defined

You should try to set up a dump device, so you get a kernel
crash dump next time.  The crash dump can be used to find
out where the crash occurred -- and I bet it's not in the
NFS code.

See the Handbook for details on how to set up a dump device.
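
For orientation, a dump device setup could look roughly like the
fragment below.  This is a sketch, not a replacement for the Handbook:
the device name /dev/ad0s1b is a placeholder, and whether the "AUTO"
value is available depends on the release, so check rc.conf(5) on
your system.

```shell
# /etc/rc.conf fragment (sketch; verify against rc.conf(5) for your release)
dumpdev="AUTO"        # or an explicit swap device such as /dev/ad0s1b
dumpdir="/var/crash"  # where savecore(8) stores the dump after reboot

# To enable a dump device without rebooting, as root
# (the device name is a placeholder):
#   dumpon /dev/ad0s1b
#
# After the next panic and reboot, the saved dump could be examined with:
#   kgdb /boot/kernel/kernel /var/crash/vmcore.0
```

The swap partition used as the dump device has to be at least as large
as the memory that gets dumped, otherwise the dump cannot be written.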

By the way, does the problem also occur when copying the
file to/from a memory disk, so no physical disk is involved?
That way you would exclude the disk and the disk driver as
potential causes.  Similarly, try a loopback NFS mount
(i.e. mount from 127.0.0.1) in order to exclude the network
interface driver as a potential cause.

If the problem still exists when copying a 10 MB file from
a memory disk to a memory disk (same or other) via a
localhost mount on the same machine, then it looks like
the NFS code might be at fault.
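
The test described above could be sketched as follows.  The unit
numbers (md10, md11), the 64 MB size and the mount points are
arbitrary choices for illustration and not taken from this thread.
The script assumes that rpcbind, mountd and nfsd are already running
and that the destination directory is exported to 127.0.0.1 in
/etc/exports; neither step is shown.  It only prints the commands
unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: prints each command unless DRY_RUN=0 is set.
# All paths, sizes and md unit numbers are hypothetical.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

SRC=/mnt/md_src     # memory disk holding the source file
DST=/mnt/md_dst     # memory disk that gets exported via NFS
NFS=/mnt/nfs_loop   # loopback NFS mount of $DST

# Create two swap-backed memory disks and put file systems on them.
run mdconfig -a -t swap -s 64m -u 10
run mdconfig -a -t swap -s 64m -u 11
run newfs /dev/md10
run newfs /dev/md11
run mkdir -p "$SRC" "$DST" "$NFS"
run mount /dev/md10 "$SRC"
run mount /dev/md11 "$DST"

# Mount the exported memory disk via the loopback address, so neither
# a physical disk nor the network interface driver is involved.
run mount_nfs 127.0.0.1:"$DST" "$NFS"

# Create a 10 MB test file and copy it through the NFS mount.
run dd if=/dev/zero of="$SRC/testfile" bs=1m count=10
run cp "$SRC/testfile" "$NFS/"
```

If the copy succeeds here but panics with the real disk or a remote
server, the disk, its driver or the network driver become the more
likely suspects.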

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

"C++ is the only current language making COBOL look good."
        -- Bertrand Meyer


More information about the freebsd-stable mailing list