Re: 13-STABLE high idprio load gives poor responsiveness and excessive CPU time per task

From: Mark Millard <marklmi_at_yahoo.com>
Date: Thu, 29 Feb 2024 16:02:42 UTC
Peter 'PMc' Much <pmc_at_citylink.dinoex.sub.org> wrote on
Date: Thu, 29 Feb 2024 13:40:05 UTC :

> On 2024-02-27, Edward Sanford Sutton, III <mirror176@hotmail.com> wrote:
> > More recently looked and see top showing threads+system processes 
> > shows I have one core getting 100% cpu for kernel{arc_prune} which has 
> > 21.2 hours over a 2 hour 23 minute uptime.
> 
> Ack.
> 
> > I started looking to see if 
> > https://www.freebsd.org/security/advisories/FreeBSD-EN-23:18.openzfs.asc 
> > was available as a fix for 13 but it is not (and doesn't quite sound 
> > like it was supposed to apply to this issue). Would a kernel thread time 
> > at 100% cpu for only 1 core explain the system becoming unusually 
> > unresponsive?
> 
> That depends. This arc_prune issue does usually go alongside with some
> other kernel thread (vm-whatever) also blocking, so you have two cores
> busy. How many remain?
> 
> There is an updated patch in the PR 275594 (5 pieces), that works for
> 13.3; I have it installed, and only with that I am able to build gcc12
> - otherwise the system would just OOM-crash (vm.pageout_oom_seq=5120
> does not help with this).

The kernel has multiple, distinct OOM messages. Which type are you
seeing? :

"failed to reclaim memory"
"a thread waited too long to allocate a page"
"swblk or swpctrie zone exhausted"
"unknown OOM reason %d"

Also, but only when booted verbose:

"proc %d (%s) failed to alloc page on fault, starting OOM\n"

vm.pageout_oom_seq delays only this one type of OOM kill:
"failed to reclaim memory"
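
For sorting out which type shows up, something like the following rough
sketch might help. It only does substring matching on the message text
quoted above. The function names are made up for illustration, and where
the lines come from (for example /var/log/messages or saved dmesg output)
is an assumption about the local setup, not something the kernel dictates.

```python
from collections import Counter

# Message fragments as quoted in this thread. The format arguments
# (%d, %s) are left out so that plain substring matching works.
OOM_REASONS = (
    "failed to reclaim memory",
    "a thread waited too long to allocate a page",
    "swblk or swpctrie zone exhausted",
    "unknown OOM reason",
    "failed to alloc page on fault, starting OOM",
)

# Per the note above, vm.pageout_oom_seq only delays this type.
DELAYED_BY_PAGEOUT_OOM_SEQ = "failed to reclaim memory"


def classify_oom_lines(lines):
    """Count how often each known OOM message appears in the given lines."""
    counts = Counter()
    for line in lines:
        for reason in OOM_REASONS:
            if reason in line:
                counts[reason] += 1
                break  # at most one reason per log line
    return counts


def tunable_is_relevant(counts):
    """True if at least one kill was of the type vm.pageout_oom_seq delays."""
    return counts[DELAYED_BY_PAGEOUT_OOM_SEQ] > 0
```

Feeding it the lines of a log file (e.g. classify_oom_lines(open(path)),
with path being wherever the kernel messages end up) gives a count per
type. If tunable_is_relevant() comes back False, raising
vm.pageout_oom_seq would not be expected to change anything.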


===
Mark Millard
marklmi at yahoo.com