rpc.lockd brokenness (2)

Thu Mar 9 03:28:04 UTC 2006

On Thu, Mar 09, 2006 at 03:12:24AM +0000, Miguel Lopes Santos Ramos wrote:
> > From: Kris Kennaway <kris at obsecurity.org>
> > Subject: Re: rpc.lockd brokenness (2)
> >
> > Yeah, the file is still locked on the server, and will never be
> > unlocked unless you stop and restart the rpc.lockd on the server
> > (which releases all the locks it holds).
> 
> I did that. Lots of times. And I removed /var/db/statd.status too when
> the daemons were not running.
> Is there any other file involved?

No: the locks are all held by the rpc.lockd process on the server, so
when that process is killed they are released (you can verify this by
running lockf on a locked file on the server before and after the
rpc.lockd is killed).
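
If lockf isn't handy, the same before/after check can be sketched in
Python.  This is only a rough stand-in for lockf -t 0, assuming a
flock-style exclusive lock is a fair substitute for this test; the
helper name try_lock is made up for illustration.

```python
import fcntl
import os


def try_lock(path):
    """Try once, without blocking, to take an exclusive lock on path.

    Returns the open file descriptor if the lock was granted (close it
    to release the lock), or None if someone else already holds it.
    Hypothetical helper, not part of any FreeBSD tool.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd
```

Calling try_lock() on the file before and after rpc.lockd is killed
should go from None to a file descriptor if the lock really was held
by rpc.lockd.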

It does seem to take a few minutes for rpc.lockd on the client to
notice when rpc.lockd is restarted on the server, but lockf -t 0 on
the client will eventually succeed in my tests.

I didn't need to stop/restart rpc.statd for it to recover state.

> No.
> There is a problem with rpc.lockd besides the other one.
> This machine hangs even when I lock files with lockf -t 0 that never existed,
> with fresh statd/lockd on client AND server (if /var/db/statd.status is
> the only file involved).

lockf -t 0 is working for me when rpc.lockd is running.  It hangs when
the server is unreachable (e.g. rpc.lockd not running on the server)
or for a few minutes after it is (re)started (I filed a PR about that
too).

Can you try to narrow down this problem some more?  For example, look
up the port used by rpc.lockd with rpcinfo on client and server, then
run tcpdump to see what locking requests are being passed back and
forth (you should see the request from client -> server and the reply
granting the lock, or not if something is going wrong).  The ethereal
port is useful for parsing the tcpdump -w -s 0 traces, btw; it decodes
the RPC packets into human-readable form.
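
A minimal sketch of the port lookup, assuming the usual five-column
"rpcinfo -p" listing in which rpc.lockd is registered under the
service name nlockmgr.  The function names and the lockd.pcap file
name are made up for illustration.

```python
def nlockmgr_ports(rpcinfo_output):
    """Return sorted (proto, port) pairs registered for nlockmgr.

    Expects the text printed by "rpcinfo -p", with the columns
    program, vers, proto, port, service.
    """
    found = set()
    for line in rpcinfo_output.splitlines():
        fields = line.split()
        # The header line and other services are skipped here.
        if len(fields) == 5 and fields[4] == "nlockmgr":
            found.add((fields[2], int(fields[3])))
    return sorted(found)


def tcpdump_command(port, outfile="lockd.pcap"):
    """Build a tcpdump invocation saving full packets for one port."""
    # -s 0 captures whole packets, -w writes the raw trace to a file.
    return ["tcpdump", "-s", "0", "-w", outfile, "port", str(port)]
```

Feed nlockmgr_ports() the output of rpcinfo -p from each machine, then
run the resulting tcpdump command as root while repeating the lockf
test.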

Running rpc.lockd -d100 on the server is also useful for tracking down
what it's doing (look in /var/log/debug.log).

> If I keep using a common home directory for all machines, and keep using
> lockd for that mount on that machine, then my only workaround is still to
> go back to 6.0-RELEASE.

I'm not certain 6.0-RELEASE is any different, since I don't see any
changes to rpc.lockd or nfs locking that were made since then.

> BTW, thank you for your support. And they talk about technical support on
> fatly paid operating systems...

You're welcome.

Kris