rpc.lockd brokenness (2)

Thu Mar 9 02:03:56 UTC 2006

On Wed, Mar 08, 2006 at 07:57:22PM -0500, Kris Kennaway wrote:
> On Thu, Mar 09, 2006 at 12:26:44AM +0000, Miguel Lopes Santos Ramos wrote:
> > > From: Kris Kennaway <kris at obsecurity.org>
> > > Subject: Re: rpc.lockd brokenness (2)
> > >
> > > This is intentional.  It's how pidfile_*() tests whether the process
> > > is still running.  The intention is that if someone tries to open the
> > > pidfile again while the first process is still running, the lock
> > > acquisition will fail and we'll know the other process is still alive,
> > > and therefore avoid starting a second instance.
> > 
> > No, no, you got me wrong. The pidfile is left locked after cron stopped
> > running (with /etc/rc.d/cron stop). This behaviour must be wrong.
> 
> OK, I misunderstood.  The rc.d script will signal cron to kill it,
> which should be closing the file descriptors and causing rpc.lockd to
> release the lock.  Perhaps this part is broken.  OK, I tested this
> with daemon -p, and it indeed seems to be broken:
> 
> haessal# daemon -p pid_file sleep 100000
> haessal# kill -KILL `cat pid_file`
> haessal# ps -p `cat pid_file`
>   PID  TT  STAT      TIME COMMAND
> haessal# lockf -t 0 pid_file echo Yay
> lockf: pid_file: already locked

The bug is triggered because the file is locked in the parent
(i.e. the daemon process, which creates the pidfile) but unlocked by
the child after the fork (in this case, when the child is killed).  On
the server, rpc.lockd compares the svid (= pid of process on the
client that is doing the lock call) of the lock and unlock requests,
notices they're different and assumes that the unlock request is
coming from some random process on the client that didn't hold the
lock in the first place.
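
To make the server's view concrete, here is a toy model of that check.
It is only a sketch of the behaviour described above, not rpc.lockd's
actual code: the ToyLockTable class, the replay() helper and the pid
values are all made up for illustration.

```python
class ToyLockTable:
    """Hypothetical stand-in for a server-side lock table (not rpc.lockd code)."""

    def __init__(self):
        self._owner = {}  # file name -> svid that took the lock

    def lock(self, name, svid):
        if name in self._owner:
            return False  # somebody already holds it
        self._owner[name] = svid
        return True

    def unlock(self, name, svid):
        if self._owner.get(name) != svid:
            return False  # svid mismatch: treated as a stranger, lock stays
        del self._owner[name]
        return True


def replay(parent_pid=100, child_pid=101):
    """Replay the sequence: parent locks, child (after fork) unlocks."""
    table = ToyLockTable()
    locked = table.lock("pid_file", parent_pid)
    unlocked = table.unlock("pid_file", child_pid)
    still_locked = not table.lock("pid_file", 999)  # a later lockf-style probe
    return locked, unlocked, still_locked
```

In this model the child's unlock is refused because its svid differs
from the parent's, so a later probe still finds the file locked, which
matches the lockf output quoted above.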

In reality, the file descriptor was passed from parent to child by the
fork(), and the child does actually hold the lock.
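
For comparison, this is roughly what happens on a local filesystem. The
sketch below assumes flock()-style locking on a temporary file (I am
assuming that is close enough to what the pidfile code does); the
try_lock() and demo() helpers are made up for the example.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Return True if an exclusive flock() on path can be taken right now."""
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False
    finally:
        os.close(fd)  # closing also drops the lock if we got it


def demo():
    fd, path = tempfile.mkstemp()
    fcntl.flock(fd, fcntl.LOCK_EX)  # lock taken in the parent, before fork
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: inherits fd (and with it the lock); wait to be told to exit.
        os.close(w)
        os.read(r, 1)
        os._exit(0)
    os.close(r)
    os.close(fd)  # parent gives up its copy; the child's copy keeps the lock
    held_while_child_alive = not try_lock(path)
    os.close(w)  # EOF on the pipe makes the child exit
    os.waitpid(pid, 0)
    free_after_child_exit = try_lock(path)
    os.unlink(path)
    return held_while_child_alive, free_after_child_exit
```

Locally the lock follows the inherited descriptor, so it stays held
while the child is alive and goes away when the child exits. That is
the behaviour the NFS case fails to reproduce.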

Fixing this is probably hard.  (Also, I can't see how this could ever
have worked with pidfile locking in cron, since it has always acquired
the lock before forking, as it does now.  Perhaps something else about
your configuration changed.)

The workaround for you is probably to avoid rpc.lockd on your
NFS-mounted /var (e.g. use mount_nfs -L).  Since you don't have
multiple machines accessing this filesystem (which wouldn't work in
any case, as noted before), you don't need it.

Kris