rpc.lockd brokenness (2)

Kris Kennaway kris at obsecurity.org
Thu Mar 9 00:57:24 UTC 2006

On Thu, Mar 09, 2006 at 12:26:44AM +0000, Miguel Lopes Santos Ramos wrote:
> > From: Kris Kennaway <kris at obsecurity.org>
> > Subject: Re: rpc.lockd brokenness (2)
> >
> > This is intentional.  It's how pidfile_*() tests whether the process
> > is still running.  The intention is that if someone tries to open the
> > pidfile again while the first process is still running, the lock
> > acquisition will fail and we'll know the other process is still alive,
> > and therefore avoid starting a second instance.
> No, no, you got me wrong. The pidfile is left locked after cron stopped
> running (with /etc/rc.d/cron stop). This behaviour must be wrong.

OK, I misunderstood.  The rc.d script signals cron to terminate, which
should close its file descriptors and cause rpc.lockd to release the
lock.  Perhaps this part is broken.  I tested this with daemon -p, and
it does indeed seem to be broken:

haessal# daemon -p pid_file sleep 100000
haessal# kill -KILL `cat pid_file`
haessal# ps -p `cat pid_file`
haessal# lockf -t 0 pid_file echo Yay
lockf: pid_file: already locked
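
For comparison, here is a minimal sketch of what the same sequence is
expected to do when the lock is released properly, e.g. on a local
filesystem.  It assumes flock()-style locking (which, as far as I
know, is what O_EXLOCK amounts to); the function names is_locked and
lock_survives_kill are made up for illustration and are not part of
any FreeBSD tool.

```python
import fcntl
import os
import signal
import time


def is_locked(path):
    """Return True if someone else holds an exclusive flock() on path."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Non-blocking attempt, roughly what "lockf -t 0" does.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        return True
    finally:
        # Closing the descriptor also drops the lock if we did get it.
        os.close(fd)
    return False


def lock_survives_kill(path):
    """Lock path in a child, SIGKILL the child, report (before, after)."""
    ready_r, ready_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: take the lock, tell the parent, then sleep until killed.
        os.close(ready_r)
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
        fcntl.flock(fd, fcntl.LOCK_EX)
        os.write(ready_w, b"x")
        time.sleep(100000)
        os._exit(0)
    os.close(ready_w)
    os.read(ready_r, 1)  # wait until the child holds the lock
    os.close(ready_r)
    held_before = is_locked(path)
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)  # the child's descriptors are closed by now
    return held_before, is_locked(path)
```

With working lock release the result should be (True, False): the lock
is held while the child runs and gone once it has been killed.  The
transcript above corresponds to the second value staying True.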

> > There is a (known) lockd bug here though, which you isolated:
> >
> So, this really is bin/80389?

No, I don't think so.  The ability to cancel locking requests has
never been implemented in FreeBSD's rpc.lockd, which is why a process
blocked on a lock is unkillable.  I'm not aware of a PR about it, so I
filed my own earlier tonight.  The problem above might be a separate
regression.
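
To illustrate what "cancelling" a lock request means, here is a rough
sketch of a blocking lock attempt that gives up when a signal arrives.
It again assumes flock()-style locking on a filesystem where the wait
is interruptible; lock_with_timeout and LockTimeout are hypothetical
names used only for this example.

```python
import fcntl
import os
import signal


class LockTimeout(Exception):
    """Raised from the SIGALRM handler to abandon a blocked flock()."""


def _on_alarm(signum, frame):
    # Raising here makes the interrupted flock() give up instead of retrying.
    raise LockTimeout()


def lock_with_timeout(path, seconds):
    """Block on an exclusive flock() for at most `seconds`.

    Returns the locked file descriptor, or None if the wait was cancelled.
    Must be called from the main thread, since it installs a signal handler.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocking request
        return fd
    except LockTimeout:
        os.close(fd)
        return None
    finally:
        # Disarm the timer and restore the previous handler.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, old_handler)
```

Where the wait can be interrupted, the second caller gets None back
after the timeout.  The missing feature described above means that the
equivalent wait through rpc.lockd cannot be interrupted this way.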

> I am a bit disappointed. First, this problem didn't cause me trouble before
> I went to 6-STABLE, now I must either disable cron or disable locking (which
> I can't).
> And I'm still not completely convinced. That problem, if I understand correctly,
> existed before January...

The pidfile_*() functions are new; before that, the pidfile handling
was done differently.

> There are two things...
> - cron.pid shouldn't be locked after cron terminated. (this interaction was
> fully saved as http://mega.ist.utl.pt/~mlsr/nfs-nofile.bin)

Actually the locking isn't traced here; I misunderstood how it works,
and the lock transactions are done on another UDP port.  You have to
use rpcinfo to figure out which one it is, since it varies.  Anyway,
the above sequence reproduces it.

> - cron shouldn't hang on startup just because the file is locked, since
> pidfile_open opens it with O_NONBLOCK (unlike lockf).

I haven't been able to reproduce this.  For example, lockf -t 0 does
O_NONBLOCK locking and works correctly when the file is already
locked.  Perhaps it's another locked file (not the pidfile) that was
also leaked in the same way, and is being opened without O_NONBLOCK.

> - cron shouldn't hang in such a way that it is not killable... (and
> shouldn't the open system call in lockf also be interruptible?)

This is the bug (really: missing feature) that I described in my
previous mail.

More information about the freebsd-stable mailing list