RE: Why is the process gets killed because "a thread waited too long to allocate a page"?
Date: Thu, 10 Oct 2024 02:21:02 UTC
Yuri <yuri_at_FreeBSD.org> wrote on
Date: Wed, 09 Oct 2024 16:12:50 UTC :
> When I tried to build lang/rust in the 14i386 poudriere VM the compiler
> got killed with this message in the kernel log:
>
>
> > Oct 9 05:21:11 yv kernel: pid 35188 (rustc), jid 1129, uid 65534,
> was killed: a thread waited too long to allocate a page
>
>
>
> The same system has no problem building lang/rust in the 14amd64 VM.
>
>
> What does it mean "waited too long"? Why is the process killed when
> something is slow?
> Shouldn't it just wait instead?
If you want to allow it to potentially wait forever,
you can use:
sysctl vm.pfault_oom_attempts=-1
(or the analogous setting in the appropriate *.conf
file that would later be applied).
You might end up with a deadlock, livelock, or the
like if you do so. (I've not analyzed the details.)
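For example, a minimal sketch of making the setting persistent (assuming /etc/sysctl.conf is where you keep such settings on your system):

```
# Allow page-fault page allocations to retry indefinitely instead of
# triggering the OOM killer after a timeout.
# Risk: a possible deadlock/livelock if memory never becomes available.
vm.pfault_oom_attempts=-1
```

`sysctl vm.pfault_oom_attempts=-1` applies the same thing immediately for the running system.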
Details:
Looking around, sys/vm/vm_pageout.c has:
case VM_OOM_MEM_PF:
reason = "a thread waited too long to allocate a page";
break;
# grep -r VM_OOM_MEM_PF /usr/main-src/sys/
/usr/main-src/sys/vm/vm_pageout.h:#define VM_OOM_MEM_PF 2
/usr/main-src/sys/vm/vm_fault.c: vm_pageout_oom(VM_OOM_MEM_PF);
/usr/main-src/sys/vm/vm_pageout.c: if (shortage == VM_OOM_MEM_PF &&
/usr/main-src/sys/vm/vm_pageout.c: if (shortage == VM_OOM_MEM || shortage == VM_OOM_MEM_PF)
/usr/main-src/sys/vm/vm_pageout.c: case VM_OOM_MEM_PF:
sys/vm/vm_fault.c :
(NOTE: the official code has its variant of the printf under an
"if (bootverbose)" conditional, but I removed that conditional locally.)
/*
 * Initiate page fault after timeout. Returns true if caller should
 * do vm_waitpfault() after the call.
 */
static bool
vm_fault_allocate_oom(struct faultstate *fs)
{
	struct timeval now;

	vm_fault_unlock_and_deallocate(fs);
	if (vm_pfault_oom_attempts < 0)
		return (true);
	if (!fs->oom_started) {
		fs->oom_started = true;
		getmicrotime(&fs->oom_start_time);
		return (true);
	}

	getmicrotime(&now);
	timevalsub(&now, &fs->oom_start_time);
	if (now.tv_sec < vm_pfault_oom_attempts * vm_pfault_oom_wait)
		return (true);

	printf("vm_fault_allocate_oom: proc %d (%s) failed to alloc page on fault, starting OOM\n",
	    curproc->p_pid, curproc->p_comm);
	vm_pageout_oom(VM_OOM_MEM_PF);
	fs->oom_started = false;
	return (false);
}
This is associated with vm.pfault_oom_attempts and
vm.pfault_oom_wait . An old comment in my
/boot/loader.conf is:
#
# For possibly insufficient swap/paging space
# (might run out), increase the pageout delay
# that leads to Out Of Memory killing of
# processes (showing defaults at the time):
#vm.pfault_oom_attempts= 3
#vm.pfault_oom_wait= 10
# (The product of the two is the total wait,
# but there are other potential tradeoffs in the
# individual factors, even for nearly the same total.)
(Note: the "tradeoffs" is associated with:
sys/vm/vm_fault.c: vm_waitpfault(dset, vm_pfault_oom_wait * hz);
)
sys/vm/vm_pageout.c :
void
vm_pageout_oom(int shortage)
{
	const char *reason;
	struct proc *p, *bigproc;
	vm_offset_t size, bigsize;
	struct thread *td;
	struct vmspace *vm;
	int now;
	bool breakout;

	/*
	 * For OOM requests originating from vm_fault(), there is a high
	 * chance that a single large process faults simultaneously in
	 * several threads.  Also, on an active system running many
	 * processes of middle-size, like buildworld, all of them
	 * could fault almost simultaneously as well.
	 *
	 * To avoid killing too many processes, rate-limit OOMs
	 * initiated by vm_fault() time-outs on the waits for free
	 * pages.
	 */
	mtx_lock(&vm_oom_ratelim_mtx);
	now = ticks;
	if (shortage == VM_OOM_MEM_PF &&
	    (u_int)(now - vm_oom_ratelim_last) < hz * vm_oom_pf_secs) {
		mtx_unlock(&vm_oom_ratelim_mtx);
		return;
	}
	vm_oom_ratelim_last = now;
	mtx_unlock(&vm_oom_ratelim_mtx);
. . .
	size = vmspace_swap_count(vm);
	if (shortage == VM_OOM_MEM || shortage == VM_OOM_MEM_PF)
		size += vm_pageout_oom_pagecount(vm);
. . .
So it looks like time-based retries, giving up after
roughly the total time implied by that many retries
(attempts * wait seconds), thereby avoiding a
potentially unbounded wait whenever
0 <= vm.pfault_oom_attempts .
===
Mark Millard
marklmi at yahoo.com