Re: Can not build kernel on 1GB VM
- In reply to: Mark Millard : "Re: Can not build kernel on 1GB VM"
Date: Fri, 15 Apr 2022 20:02:47 UTC
On 2022-Apr-15, at 11:40, Mark Millard <marklmi@yahoo.com> wrote:
> From: Michael Wayne <freebsd07_at_wayne47.com>
> Date: Fri, 15 Apr 2022 13:49:53 -0400 :
>
>> I have a VM with 1GB RAM running FreeBSD 12.1-RELEASE-p3
>>
>> I'm trying to upgrade the machine to 12.3 and having swap failures.
>>
>> This machine runs bird to advertise BGP, ssh and not much else so
>> the small amount of RAM is (usually) fine.
>>
>> For a long time, there was a 1 GB swap file which handled the
>> occasional time when excess memory got used.
>>
>> Machine needs a custom kernel for BGP, the conf file consists of:
>> include GENERIC
>> ident ROUTING
>> options TCP_SIGNATURE
>>
>>
>> Today, while building the 12.3 kernel with:
>> cd /usr/src
>> sudo make toolchain
>> sudo make buildkernel KERNCONF=ROUTING
>> the machine ran out of swap, with a bunch of messages like:
>> Apr 15 12:11:26 g1 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 240593, size: 4096
>> Apr 15 12:11:35 g1 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 236224, size: 16384
>> Apr 15 12:11:37 g1 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 245, size: 12288
>> Apr 15 12:11:46 g1 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 240593, size: 4096
>> Apr 15 12:11:55 g1 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 236224, size: 16384
>> Apr 15 12:11:57 g1 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 245, size: 12288
>>
>> Thinking it was a swap space issue, I increased the swap to 4 GB and
>> tried again with the same results. Boot gave the kern.maxswzone message,
>> I ignored it as I had planned to change as soon as I completed the build.
>>
>> So I pulled up top in a console window and watched swap during the
>> build. About 400 MB of RAM was free and about 3 MB of swap was
>> used when the machine started linking the kernel:
>> ctfmerge -L VERSION -g -o kernel.full ...
>> While this command was running, I saw swap usage go to ~5MB (so
>> just over 1%), then started seeing processes being killed due to
>> out of swap space.
>
> The "out of swap space" message is usually a misnomer
I should have been explicit: the message is a misnomer only
when it is part of an OOM kill notification. There is a
separate "out of swap space" message that is just a
notification of that status. That message is not a misnomer
and need not imply that an OOM kill will happen or has
happened.
> and has
> been replaced in main [so: 14], stable/13 , and releng/13.1 :
>
>     case VM_OOM_MEM:
>         reason = "failed to reclaim memory";
>         break;
>     case VM_OOM_MEM_PF:
>         reason = "a thread waited too long to allocate a page";
>         break;
>
> (There is one more case that still has the misnomer but
> case VM_OOM_SWAPZ seems unlikely to actually happen.)
>
> Given that you are getting the swap_pager: indefinite wait buffer
> notices I can not tell which of the two above is happening.
>
>> So, how to proceed?
>
> My /boot/loader.conf has the likes of:
>
> # Delay when persistent low free RAM leads to
> # Out Of Memory killing of processes:
> vm.pageout_oom_seq=120
> #
> # For plenty of swap/paging space (will not
> # run out), avoid pageout delays leading to
> # Out Of Memory killing of processes:
> vm.pfault_oom_attempts=-1
> #
> # For possibly insufficient swap/paging space
> # (might run out), increase the pageout delay
> # that leads to Out Of Memory killing of
> # processes (showing defaults at the time):
> #vm.pfault_oom_attempts= 3
> #vm.pfault_oom_wait= 10
> # (The multiplication is the total but there
> # are other potential tradeoffs in the factors
> # multiplied, even for nearly the same total.)
>
> The vm.pageout_oom_seq=120 delays VM_OOM_MEM.
> The vm.pfault_oom_attempts=-1 avoids VM_OOM_MEM_PF.
>
> Note: as I understand it, vm.pfault_oom_attempts=-1
> can lead to deadlock if you actually run out of swap.
>
> You could try setting both vm.pfault_oom_attempts and
> vm.pfault_oom_wait but I've no specific suggested
> values for your context.
>
>
> Note: I do not recommend having so much swap that
> you get the kern.maxswzone message. I do not
> recommend adjusting kern.maxswzone as it competes
> with other kernel resources --unless you understand
> the tradeoffs in fair detail. (I do not understand
> them in much detail.)
>
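To make "the multiplication is the total" from the quoted
loader.conf comments concrete, here is a minimal sketch.
pfault_oom_total is a made-up helper name, and I am assuming
that any negative attempts value behaves like the -1 setting
(that OOM path is avoided):

```shell
# Hypothetical helper: approximate total time (seconds) a page fault
# waits before VM_OOM_MEM_PF, given attempts and per-attempt wait.
# Assumption: a negative attempts value (e.g. -1) means "never give up".
pfault_oom_total() {
    attempts=$1
    wait=$2
    if [ "$attempts" -lt 0 ]; then
        echo unlimited
    else
        echo $((attempts * wait))
    fi
}

pfault_oom_total 3 10    # defaults shown above; prints 30
```

On a live system the two arguments could come from
"sysctl -n vm.pfault_oom_attempts" and
"sysctl -n vm.pfault_oom_wait".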
FYI: "swap_pager: indefinite wait buffer" is for
a swap read taking over 20 seconds (at least
in main [so: 14]):
    /*
     * Wait for the pages we want to complete.  VPO_SWAPINPROG is always
     * cleared on completion.  If an I/O error occurs, SWAPBLK_NONE
     * is set in the metadata for each page in the request.
     */
    VM_OBJECT_WLOCK(object);
    /* This could be implemented more efficiently with aflags */
    while ((ma[0]->oflags & VPO_SWAPINPROG) != 0) {
        ma[0]->oflags |= VPO_SWAPSLEEP;
        VM_CNT_INC(v_intrans);
        if (VM_OBJECT_SLEEP(object, &object->handle, PSWP,
            "swread", hz * 20)) {
            printf(
"swap_pager: indefinite wait buffer: bufobj: %p, blkno: %jd, size: %ld\n",
                bp->b_bufobj, (intmax_t)bp->b_blkno, bp->b_bcount);
        }
    }
    VM_OBJECT_WUNLOCK(object);
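
The hz * 20 timeout is consistent with the quoted log: each
blkno repeats 20 seconds after its previous notice (12:11:26
and 12:11:46 for blkno 240593, for example), so the same
reads were still pending. A rough way to check that in a
log is sketched below. swread_gaps is a hypothetical helper;
it assumes unquoted syslog-style lines with the time in the
third field and does not handle a midnight rollover:

```shell
# Hypothetical helper: for each repeated blkno in "indefinite wait
# buffer" notices, print the blkno and the seconds since its last notice.
# Reads the named files, or stdin when none are given.
swread_gaps() {
    awk '/swap_pager: indefinite wait buffer/ {
        split($3, t, ":")
        now = t[1] * 3600 + t[2] * 60 + t[3]
        for (i = 1; i < NF; i++)
            if ($i == "blkno:") { blk = $(i + 1); sub(/,$/, "", blk) }
        if (blk in last) print blk, now - last[blk]
        last[blk] = now
    }' "$@"
}
```

For example: swread_gaps /var/log/messages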
Also, for reference:
# sysctl -d vm.pageout_oom_seq vm.pfault_oom_attempts vm.pfault_oom_wait
vm.pageout_oom_seq: back-to-back calls to oom detector to start OOM
vm.pfault_oom_attempts: Number of page allocation attempts in page fault handler before it triggers OOM handling
vm.pfault_oom_wait: Number of seconds to wait for free pages before retrying the page fault handler
The default for vm.pageout_oom_seq was 12 last I checked.
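
If a reboot is inconvenient, the same values can probably be
applied to the running system instead of via loader.conf. I
am assuming here that both sysctls are writable at runtime on
your release. set_oom_tunables and DRYRUN are names made up
for this sketch, and the deadlock caveat for
vm.pfault_oom_attempts=-1 still applies:

```shell
# Hypothetical helper: apply the OOM-related tunables at runtime (as root).
# Assumption: both sysctls are writable on the running release.
# With DRYRUN=1 the commands are only printed, not executed.
set_oom_tunables() {
    for kv in vm.pageout_oom_seq=120 vm.pfault_oom_attempts=-1; do
        if [ "${DRYRUN:-0}" = 1 ]; then
            echo "sysctl $kv"
        else
            sysctl "$kv"
        fi
    done
}
```

Running "DRYRUN=1 set_oom_tunables" first shows what would
be changed.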
===
Mark Millard
marklmi at yahoo.com