svn commit: r267760 - head/sys/kern

Mon Jun 23 16:35:31 UTC 2014

On Mon, Jun 23, 2014 at 03:16:53PM +0200, Mateusz Guzik wrote:
> On Mon, Jun 23, 2014 at 11:18:23AM +0300, Konstantin Belousov wrote:
> > On Mon, Jun 23, 2014 at 10:05:01AM +0200, Mateusz Guzik wrote:
> > > On Mon, Jun 23, 2014 at 10:25:19AM +0300, Konstantin Belousov wrote:
> > > > On Mon, Jun 23, 2014 at 09:06:52AM +0200, Mateusz Guzik wrote:
> > > > > The table is modified in these functions and is reachable from the rest
> > > > > of the kernel (can be found by e.g. sysctl_kern_proc_filedesc), thus
> > > > > XLOCK is needed to ensure consistency for readers. It can also be
> > > > > altered by mountcheckdirs, although not in a way which disrupts any of
> > > > > these functions.
> > > > I would think that such cases should be avoided by testing for P_INEXEC,
> > > > but I do not insist on this.
> > > > 
> > > 
> > > proc lock has to be dropped before filedesc lock is taken and I don't
> > > see any way to prevent the proc from transitioning to P_INEXEC afterwards.
> > > Since sysctl_kern_proc_filedesc et al can take a long time it does not
> > > seem feasible to pursue this.
> > > 
> 
> After a second look this problem has to be dealt with.
> 
> If traversal while transition to P_INEXEC is allowed, execve dealing
> with a setuid binary is problematic. This is more of hypothetical nature,
> but with sufficienly long delay it could finish the syscall and start
> opening some files, which paths would now be visible for an unprivileged
> reader.
> 
> That said, I propose adding a counter to struct proc which would which
> would block execve. It would be quite similar to p_lock.
I thought about this too.  In fact, I considered using PHOLD for this.

> 
> iow execve would:
> 
>         PROC_LOCK(p);
> 	p->p_flag |= P_INEXEC; 
>         while (p->p_execlock > 0)
>                 msleep(&p->p_execlock, &p->p_mtx, PWAIT, "execlock", 0);
> 	PROC_UNLOCK(p);
> 
> And it would be mandatory for external fdp consumers to grab the counter.
> 
> I'm tempted to add P_GETPIN which would both increase p_lock and p_execlock,
> that way the process is guaranteed not to exit and not to execve even
> after proc lock is dropped.
See above about PHOLD.

> 
> There is a separate question if p_execlock should be renamed and
> extended to also block any kind of credential changes.
> 
> Then the guarantee is even stronger since we know that credentials we
> checked against are not going to change for the duration of our
> operations, but it is unclear if we need this.

If doing separate execlock/p_lock, I think that it could be possible
to use per-process sx lock instead of hand-rolling the counter.  The
accessors would lock sx shared, while kern_execve would take it in
exclusive mode.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 819 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/svn-src-head/attachments/20140623/6f3160ae/attachment.sig>