mlock(2) for ordinary users

Sat Jul 22 15:16:38 UTC 2006

On Sat, Jul 22, 2006 at 03:52:37PM +0100, Robert Watson wrote:
> 
> On Fri, 21 Jul 2006, Peter Jeremy wrote:
> 
> >Currently mlock() and munlock() are restricted to the root user - which 
> >prevents an ordinary user locking their process into RAM to the detriment 
> >of the system as a whole.  Whilst this is a valid concern, there are good 
> >security reasons for allowing a user to lock small amounts of memory (a 
> >few pages) to ensure that sensitive information (private keys, passwords, 
> >etc.) doesn't wind up on swap devices.
> >
> >There is a resource limit for locked pages (RLIMIT_MEMLOCK) and, despite 
> >the man page, a quick look at the code implies that it really is honoured. 
> >Could someone with more VM-foo please confirm whether the last line of the 
> >man page is still correct.
> >
> >I would like to suggest that the suser() tests in mlock() and munlock() be 
> >removed and the default RLIMIT_MEMLOCK be reduced from infinity to (say) 
> >1. The only gotcha I can see is that lots of sysctl() functions use 
> >RLIMIT_MEMLOCK via sysctl_wire_old_buffer() and vslock().
> 
> I think I'd like to see the functionality you suggest -- i.e., the ability 
> to allocate pinned memory pages to unprivileged processes.  However, I have 
> to wonder about whether this isn't already enabled for a reason -- in 
> particular, I have to wonder if it works at all.  The whole idea of 
> resource limits is that you bill new use to a credential, and credit 
> reduced use to a similar credential.  Probably, we're interested only in 
> memory pinned at the request of the process, not memory pinned by the 
> kernel on its behalf.  The normal questions I'd try to answer about whether 
> it works currently are:
> 
> - When pages become locked on behalf of a credential, is it correctly billed
>   to the credential?
> 
> - When pages become unlocked (or are released), are any credentials that
>   have requested it be locked credited?
> 
> - What happens when the credential on a process changes between when memory
>   is locked and unlocked?
> 
> - What happens if more than one credential requests the same page of memory
>   be locked and unlocked?
> 
> - Is locked memory properly credited back to the credential on process exit
>   and other non-explicit unmapping points?
> 
> Note in particular that more than one credential can request that the same 
> page be locked -- if two processes map the same page from a file, or one is 
> a fork of the other and has inherited a shared mapping, we need to handle 
> that "correctly".  And we need to handle cases like setuid -- as with other 
> resource limit implementations, the right credential needs to be credited. 
> In the case of socket limits, for example, we actually keep a reference to 
> the allocating credential in the struct socket so that when the socket is 
> freed, we can credit the resources back to the original credential, not to 
> the credential of whatever process last references the socket.  Presumably 
> something similar would be required here, and a quick glance doesn't 
> suggest this is implemented.

As far as I remember, RLIMIT_MEMLOCK is per-process rather than per-cred.
As a consequence, allowing mlock() for non-root users would actually allow
such a user to lock up to value-of(RLIMIT_MEMLOCK) * value-of(RLIMIT_NPROC).
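
To make that product concrete, here is a minimal userland sketch. It
assumes the limit really is enforced per process, as recalled above; the
function names worst_case_locked() and current_worst_case() are made up
for illustration and are not part of any system API.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>

/*
 * Upper bound on the memory one user could lock if RLIMIT_MEMLOCK is
 * enforced per process (an assumption): each of up to nproc processes
 * may lock up to memlock bytes.  Returns UINT64_MAX when either limit
 * is unlimited or when the product would overflow.
 */
uint64_t
worst_case_locked(rlim_t memlock, rlim_t nproc)
{
	uint64_t m = (uint64_t)memlock;
	uint64_t n = (uint64_t)nproc;

	if (memlock == RLIM_INFINITY || nproc == RLIM_INFINITY)
		return (UINT64_MAX);
	if (n != 0 && m > UINT64_MAX / n)
		return (UINT64_MAX);
	return (m * n);
}

/*
 * Hypothetical convenience wrapper: compute the bound from the calling
 * process's current soft limits.  Returns 0 on success, -1 on error.
 */
int
current_worst_case(uint64_t *out)
{
	struct rlimit ml, np;

	if (getrlimit(RLIMIT_MEMLOCK, &ml) != 0 ||
	    getrlimit(RLIMIT_NPROC, &np) != 0)
		return (-1);
	*out = worst_case_locked(ml.rlim_cur, np.rlim_cur);
	return (0);
}
```

For example, a 64 KB RLIMIT_MEMLOCK combined with an RLIMIT_NPROC of 100
bounds the user's total at 6400 KB, not 64 KB.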

In fact, I had to answer the questions asked above when I implemented
the per-user swap limits. The design I ended up with was to add a
reference to the originating cred to vm_map_entry and vm_object (with
somewhat complicated logic to move the ref from entry to object on
occasion).
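
The idea of keeping a reference to the originating cred can be sketched
in a few lines of userland C. This is only a toy model under stated
assumptions: struct acct_cred and struct acct_entry are hypothetical
stand-ins, not the real struct ucred or vm_map_entry, and neither shared
pages nor moving the reference from entry to object is modelled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-credential accounting record (not struct ucred). */
struct acct_cred {
	unsigned refcnt;	/* references held by billed entries */
	uint64_t locked_pages;	/* pages currently billed to this cred */
	uint64_t max_locked;	/* per-credential limit, in pages */
};

/* Hypothetical stand-in for a map entry that remembers who was billed. */
struct acct_entry {
	struct acct_cred *owner; /* originating cred, NULL if not billed */
	uint64_t npages;	 /* pages billed through this entry */
};

/*
 * Bill npages to cred and record cred in the entry.
 * Returns 0 on success, -1 if already billed or over the limit.
 */
int
acct_lock(struct acct_entry *e, struct acct_cred *cred, uint64_t npages)
{
	if (e->owner != NULL)
		return (-1);
	if (npages > cred->max_locked - cred->locked_pages)
		return (-1);
	cred->locked_pages += npages;
	cred->refcnt++;		/* the entry now holds a reference */
	e->owner = cred;
	e->npages = npages;
	return (0);
}

/*
 * Credit the cred saved in the entry, regardless of which credential
 * the caller holds at unlock or exit time.
 */
void
acct_unlock(struct acct_entry *e)
{
	struct acct_cred *cred = e->owner;

	if (cred == NULL)
		return;
	cred->locked_pages -= e->npages;
	cred->refcnt--;		/* drop the entry's reference */
	e->owner = NULL;
	e->npages = 0;
}
```

Because the credit goes to the saved owner, a credential change between
lock and unlock (setuid, for instance) cannot leave pages billed to one
cred and credited to another.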