Fix for sys_munlock(2) with racct
Jeremie Le Hen
jlh at FreeBSD.org
Sun Jul 21 21:50:46 UTC 2013
On Sat, Jul 20, 2013 at 08:33:35PM -0700, Alan Cox wrote:
>
> On Jul 20, 2013, at 4:22 AM, Jeremie Le Hen wrote:
>
> > Hi Edward, Alan,
> >
> > I plan to commit the following patch:
> > http://people.freebsd.org/~jlh/racct_munlock.diff
> >
> > This solves the following panic:
> >
> > panic: racct_sub: freeing 301989888 of resource 5, which is more than allocated 73728 for pwsafe (pid 4177)
> >
> > The problem is that the racct code in sys_munlock() trusts too much the
> > user's input. vm_map_unwire_count() now returns how much memory has
> > really been unwired.
> >
> > Any objection?
>
>
> Can you elaborate on what you mean by "sys_munlock() trusts too much
> the user's input." munlock(2) is supposed to return ENOMEM if any
> addresses within the specified range are not backed by valid mappings.
> (Valid mappings with PROT_NONE access are something of a gray area
> here.) However, it is not error for a to call munlock() on a range
> that isn't locked, or to call it a second, third, etc. time on the
> same range. Is that what is provoking your panic?
I'm using sysutils/pwsafe and it seems to mlock(2) 18 pages (end - start
= 73728 bytes) but then munlock(2) 73728 pages :-). vm_map_unwire()
seems to do the right thing with those buggy boundaries, but then the
racct code assumes that those boundaries were correct as long as
vm_map_unwire() returned successfully. This is what causes the
assertion failure.
> By the way, sys_mlock() uses a simpler approach to deal with a similar
> situation:
>
> error = vm_map_wire(map, start, end,
> VM_MAP_WIRE_USER | VM_MAP_WIRE_NOHOLES);
> #ifdef RACCT
> if (error != KERN_SUCCESS) {
> PROC_LOCK(proc);
> racct_set(proc, RACCT_MEMLOCK,
> ptoa(pmap_wired_count(map->pmap)));
> PROC_UNLOCK(proc);
> }
> #endif
>
> However, the code in sys_mlock() for maintaining RACCT_MEMLOCK,
> including the above snippet, is race-y. Two simultaneous callers to
> sys_mlock() will produce incorrect results. (I haven't looked at
> sys_mlockall() or vm_map_growstack().)
Ok, I will commit that as the first bandaid then.
> Also, a wired mapping can be destroyed by calling munmap(2) without
> first calling munlock(2), in which case, RACCT_MEMLOCK will be
> incorrect.
So I think the right way to tackle this is to handle racct in the vm
layer rather than at the syscall layer.
vm_fault_{wire,unwire}() look like natural places for this, but racct
functions require the struct proc to be locked. Any idea how to handle
this?
--
Jeremie Le Hen
Scientists say the world is made up of Protons, Neutrons and Electrons.
They forgot to mention Morons.
More information about the freebsd-current
mailing list