Extending MADV_PROTECT

John Baldwin jhb at freebsd.org
Thu May 9 12:16:31 UTC 2013


On Thursday, May 09, 2013 4:25:38 am Konstantin Belousov wrote:
> On Wed, May 08, 2013 at 12:09:49PM -0400, John Baldwin wrote:
> > On Wednesday, May 08, 2013 5:58:27 am Konstantin Belousov wrote:
> > > On Tue, May 07, 2013 at 02:33:27PM -0400, John Baldwin wrote:
> > > > One of the issues I have with our current MADV_PROTECT is that it
> > > > isn't very administrative-friendly. That is, as a sysadmin I can't
> > > > easily protect arbitrary processes from the OOM killer. Instead, the
> > > > binary has to be changed to invoke madvise(). Furthermore, once the
> > > > protection is granted it can't be revoked. Also, any binaries that
> > > > want this have to be run as root. Instead, I would like to be able
> > > > to both set and revoke this for existing processes and possibly even
> > > > allow it to be inherited (so I can tag a top-level daemon that forks
> > > > and have all its future children be protected for example). To that
> > > > end I've whipped up a simple patch (against 8, but should port to
> > > > HEAD easily if folks think it is a good idea) to add a new pprotect()
> > > > system call and userland program (protect) that can be used similarly to
> > > > ktrace(1) either as a modifier when running a new program or as a tool
> > > > for setting or clearing protection for existing processes.
> > > >
> > > > The inherit feature isn't implemented yet, but it should be simple
> > > > to do. One would simply need a new flag that PPROT_INHERIT sets that
> > > > is checked on fork and propagates P_PROTECTED if it is set. Also,
> > > > one other thought I had is that at some point we might want to make
> > > > P_PROTECTED more fine-grained, e.g. by allowing for OOM "priorities".
> > > > To that end, it may make sense to add a new argument to protect,
> > > > though you could also reserve part of the 'op' parameter to encode a
> > > > priority.
> > > 
> > > Wouldn't the pprot_setchildren() miss a child for which the new pid and
> > > struct proc are already allocated in the do_fork(), but which is not yet
> > > linked into the process tree?  If true, I think this does not
> > > fulfill the promise of the PPROT_DESCEND.
> > 
> > ktrace has the same issue, and really, this is just a race.  If the user
> > had run the command a few nanoseconds earlier the proc wouldn't be allocated
> > at all, and I doubt a user would notice the difference in those two cases.
> > If you are doing this programmatically then that is a race that the program
> > can handle.  It isn't any different from having a new process begin its
> > fork() a few nanoseconds after this returns either.  This is why if you
> > want that behavior you would use -di (and this applies equally to ktrace).
> So to get this correct, a person first should enable inheritance, and only
> then turn on the protection on the subtree? This sounds somewhat sloppy,
> but fine.

Yes, ktrace works the same way.  In practice however, if you know your process
isn't actively forking (e.g. a daemon that forks a child at startup but then
doesn't fork again), you can use -d just fine.
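
To make the -d versus -di difference concrete, here is a toy userland model
of the idea. It is only a sketch: struct toy_proc, toy_fork() and
toy_protect_descend() are hypothetical names invented for illustration, not
code from the patch, and the assumption that a child also inherits the
inherit flag itself is mine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAXKIDS 8

/* Toy stand-in for a process; this is NOT the kernel's struct proc. */
struct toy_proc {
	bool is_protected;		/* models P_PROTECTED */
	bool inherit;			/* models the proposed inherit flag */
	struct toy_proc *kids[TOY_MAXKIDS];
	size_t nkids;
};

/*
 * Model of fork: the child picks up the parent's protection only when
 * the parent's inherit flag is set.  Assumption: the inherit flag is
 * passed on as well, so grandchildren are covered too.
 */
static void
toy_fork(struct toy_proc *parent, struct toy_proc *child)
{
	assert(parent->nkids < TOY_MAXKIDS);
	*child = (struct toy_proc){ 0 };
	if (parent->inherit) {
		child->is_protected = parent->is_protected;
		child->inherit = true;
	}
	parent->kids[parent->nkids++] = child;
}

/*
 * Model of a descend: a one-shot walk over the children that are linked
 * into the tree right now.  Anything forked later is not visited.
 */
static void
toy_protect_descend(struct toy_proc *p, bool inherit)
{
	p->is_protected = true;
	if (inherit)
		p->inherit = true;
	for (size_t i = 0; i < p->nkids; i++)
		toy_protect_descend(p->kids[i], inherit);
}
```

In this model, calling toy_protect_descend(&top, false) covers only the
children that already exist, so a later toy_fork() from top yields an
unprotected child. Passing true instead means a later fork is covered as
well, which is the behavior you would want for a process that keeps forking.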

> > > Since the syscall is meant to be extended in the future, would it make
> > > more sense to add a multiplexer, e.g. procctl(2), one operation of which
> > > would be PROCCTL_PROTECT ?
> > 
> > Do we expect it to do more than adjust protection?  We already have a few
> > other process-control system calls (e.g. ptrace()).  It's hard to ensure
> > it is sufficiently generic when only abstracting from one use case.
> 
> You mentioned a priority, and I think ability to pass a structure to the
> sub-function of the syscall is better than carving bits in the int argument,
> or introducing a new syscall.

I think the priority would still be a pprotect operation.  In some ways it would
be nice to be able to do ioctls on processes and maybe this could be structured
similarly?

int procctl(int pid, unsigned long cmd, ...)

(So it's basically ioctl but with the 'fd' replaced with 'pid'.  This would also
mean that in the future with Robert's pdfork() you could perhaps have ioctl on
a process fd just forward the request to procctl).
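
A very rough userland mock of what such an ioctl-style dispatcher might look
like is below. Everything in it is an assumption made up for illustration:
the toy_ prefix, the command numbers, the argument structure (including the
priority field) and the fixed-size table standing in for the process list.
It only shows the shape of "pid plus command plus pointer to a per-command
structure".

```c
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>

#define TOY_NPROC 4

/* Hypothetical command numbers; none of these exist in FreeBSD. */
#define TOY_PROC_PROTECT	1UL	/* set or revoke protection */
#define TOY_PROC_GETPROT	2UL	/* read back the current setting */

/* Hypothetical per-command argument, passed by pointer as with ioctl. */
struct toy_protect_args {
	int enable;	/* nonzero to protect, zero to revoke */
	int priority;	/* made-up OOM priority, unused when revoking */
};

/* Stand-in for per-process state, indexed by a fake pid. */
static struct toy_protect_args toy_table[TOY_NPROC];

/*
 * Mock dispatcher with the same shape as the proposed prototype.  In
 * this sketch every command takes exactly one pointer argument, so the
 * caller must always pass one.
 */
static int
toy_procctl(int pid, unsigned long cmd, ...)
{
	va_list ap;
	struct toy_protect_args *args;

	if (pid < 0 || pid >= TOY_NPROC) {
		errno = ESRCH;
		return (-1);
	}
	va_start(ap, cmd);
	args = va_arg(ap, struct toy_protect_args *);
	va_end(ap);
	if (args == NULL) {
		errno = EINVAL;
		return (-1);
	}

	switch (cmd) {
	case TOY_PROC_PROTECT:
		toy_table[pid] = *args;
		return (0);
	case TOY_PROC_GETPROT:
		*args = toy_table[pid];
		return (0);
	default:
		errno = EINVAL;
		return (-1);
	}
}
```

With this shape, adding something like a priority later means adding a field
or a new command rather than carving bits out of an int or adding another
system call.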

-- 
John Baldwin


More information about the freebsd-arch mailing list