git: dce565949914 - main - epoch: Don't idle CPUs when there's pending epoch work
Date: Tue, 21 Apr 2026 16:13:42 UTC
The branch main has been updated by markj:
URL: https://cgit.FreeBSD.org/src/commit/?id=dce56594991464c276f340ce963d0f5461566c78
commit dce56594991464c276f340ce963d0f5461566c78
Author: Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2026-04-21 14:28:31 +0000
Commit: Mark Johnston <markj@FreeBSD.org>
CommitDate: 2026-04-21 16:13:19 +0000
epoch: Don't idle CPUs when there's pending epoch work
The epoch(9) subsystem implements per-CPU queues of object destructors
which get invoked once it is safe to do so. These queues are polled via
hardclock().
When a CPU is about to go idle, we reduce the hardclock frequency to 1Hz
by default, to avoid unneeded wakeups. This means that if there is any
garbage in these destructor queues, it won't be cleared for at least 1s
(and possibly longer) even if it would otherwise be safe to do so.
epoch_drain_callbacks() is used in some places to provide a barrier,
ensuring that all garbage present in the destructor queues is cleaned up
before returning. It's implemented by adding a fake destructor in the
queues and blocking until it gets run on all CPUs. The above-described
phenomenon means that it can take a long time for these calls to return,
even (especially) when some CPUs are idle. This causes long delays when
destroying VNET jails, for instance, as epoch_drain_callbacks() is
invoked each time a network interface is destroyed.
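To illustrate why a single idle CPU dominates the wait, here is a minimal
userland sketch, not kernel code. It assumes each CPU polls its destructor
queue at a fixed interval and ignores the grace period a destructor must
also wait for. The function name drain_wait_bound_ms and the millisecond
intervals are hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical userland model, not kernel code: each array entry is the
 * interval, in milliseconds, at which one CPU polls its destructor queue.
 */
static unsigned
drain_wait_bound_ms(const unsigned *poll_interval_ms, size_t ncpu)
{
	unsigned worst = 0;

	/* The barrier returns only after the slowest CPU has polled. */
	for (size_t i = 0; i < ncpu; i++) {
		if (poll_interval_ms[i] > worst)
			worst = poll_interval_ms[i];
	}
	return (worst);
}
```

With three busy CPUs polling every 1 ms (an assumed hardclock rate) and one
idle CPU polling every 1000 ms, the model returns 1000: the idle CPU alone
sets the bound.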
Work around this problem by not disabling the hardclock timer if there
is garbage present in the destructor queues. The implementation of
epoch_drain_callbacks() has other problems, but this small change on its
own gives a good improvement, especially when running networking
regression tests. Moreover, we should aim to invoke destructors in a
timely manner, so the change is generally beneficial.
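The new condition can be sketched in userland as a small predicate. This is
an illustration only: struct cpu_model and may_skip_hardclock are
hypothetical names, and the epoch_cb_count field merely stands in for the
per-CPU counter that the patch reads with DPCPU_GET(epoch_cb_count).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU state consulted by the kernel. */
struct cpu_model {
	bool idle;			/* CPU is about to go idle */
	unsigned epoch_cb_count;	/* models DPCPU_GET(epoch_cb_count) */
};

/*
 * Return true when hardclock events may be skipped: the CPU must be idle
 * and must have no pending epoch destructors.
 */
static bool
may_skip_hardclock(const struct cpu_model *cpu)
{
	return (cpu->idle && cpu->epoch_cb_count == 0);
}
```

An idle CPU with an empty queue may still skip hardclock events; an idle CPU
with queued destructors keeps its regular tick until the queue drains.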
Reviewed by: glebius
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D56508
---
sys/kern/kern_clocksource.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/sys/kern/kern_clocksource.c b/sys/kern/kern_clocksource.c
index 6bf3bbd83245..637610654648 100644
--- a/sys/kern/kern_clocksource.c
+++ b/sys/kern/kern_clocksource.c
@@ -36,6 +36,7 @@
 #include <sys/param.h>
 #include <sys/systm.h>
 #include <sys/bus.h>
+#include <sys/epoch.h>
 #include <sys/limits.h>
 #include <sys/lock.h>
 #include <sys/kdb.h>
@@ -235,7 +236,7 @@ getnextcpuevent(struct pcpu_state *state, int idle)
 	/* Handle hardclock() events, skipping some if CPU is idle. */
 	event = state->nexthard;
-	if (idle) {
+	if (idle && DPCPU_GET(epoch_cb_count) == 0) {
 		if (tc_min_ticktock_freq > 1
 #ifdef SMP
 		    && curcpu == CPU_FIRST()