git: 893be9d8ac16 - main - sleepqueue: Address a lock order reversal
Date: Mon, 14 Feb 2022 15:07:02 UTC
The branch main has been updated by markj:
URL: https://cgit.FreeBSD.org/src/commit/?id=893be9d8ac161c4cc96e9f3f12f1260355dd123b
commit 893be9d8ac161c4cc96e9f3f12f1260355dd123b
Author: Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2022-02-14 14:38:53 +0000
Commit: Mark Johnston <markj@FreeBSD.org>
CommitDate: 2022-02-14 15:06:47 +0000
sleepqueue: Address a lock order reversal
After commit 74cf7cae4d22 ("softclock: Use dedicated ithreads for
running callouts."), there is a lock order reversal between the per-CPU
callout lock and the scheduler lock. softclock_thread() acquires the
callout lock and then the scheduler lock when preparing to switch
off-CPU, while sleepq_remove_thread() stops the timed-sleep callout
while potentially holding a scheduler lock. In the latter case it is
the thread itself that is locked: if the thread is already asleep, its
lock is a sleepqueue lock, but if it is still in the process of going
to sleep, its lock is a scheduler lock.
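The two conflicting acquisition orders can be sketched in user space. This is only an illustrative model, not kernel code: the mutex names stand in for the per-CPU callout lock and a scheduler lock, and the trylock back-off in the second path is one conservative way to avoid blocking in the reversed order (the actual fix, shown in the diff below, skips stopping the callout entirely).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the per-CPU callout lock and a scheduler
 * (thread container) lock described in the commit message. */
static pthread_mutex_t callout_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t sched_lock = PTHREAD_MUTEX_INITIALIZER;

/* softclock_thread()-style path: callout lock, then scheduler lock,
 * when preparing to switch off-CPU. */
static void
softclock_path(void)
{
	pthread_mutex_lock(&callout_lock);
	pthread_mutex_lock(&sched_lock);
	/* ... run or hand off callouts, then switch off-CPU ... */
	pthread_mutex_unlock(&sched_lock);
	pthread_mutex_unlock(&callout_lock);
}

/* sleepq_remove_thread()-style path while td_lock is a scheduler lock:
 * the scheduler lock is already held, and stopping the callout would
 * need the callout lock -- the opposite order.  Blocking here could
 * deadlock against softclock_path(); trylock lets us back off instead. */
static bool
remove_thread_path(void)
{
	bool stopped = false;

	pthread_mutex_lock(&sched_lock);
	if (pthread_mutex_trylock(&callout_lock) == 0) {
		/* ... model of callout_stop(&td->td_slpcallout) ... */
		stopped = true;
		pthread_mutex_unlock(&callout_lock);
	}
	pthread_mutex_unlock(&sched_lock);
	return (stopped);
}
```

With no contention, the trylock succeeds and the callout is stopped; under contention the path backs off rather than deadlocking, which mirrors the "accept a spurious timeout" trade-off the patch makes.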
We could perhaps change softclock_thread() to acquire the locks in the
opposite order, but that would require dropping and re-acquiring the
callout lock, which seems expensive for an operation that happens quite
frequently. Instead, this patch avoids stopping the td_slpcallout
callout when the thread is still going to sleep. This can result in a
spurious call to sleepq_timeout(), but some counters suggest that this
is very rare.
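The resulting condition can be modeled as a small predicate: stop the callout only when the thread's lock is the sleepqueue chain lock (`sc->sc_lock`), i.e. the thread is actually asleep rather than still going to sleep. The struct and field names below mirror the kernel code but this is a simplified user-space sketch, not the real `struct thread`.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel structures involved. */
#define TDF_TIMEOUT	0x01

struct lock {
	int	dummy;
};

struct thread_model {
	int		td_flags;	/* TDF_* flags */
	int		td_sleeptimo;	/* nonzero if a timed sleep is armed */
	struct lock	*td_lock;	/* current container lock of the thread */
};

/*
 * Model of the condition added by this commit: callout_stop() is safe
 * only when td_lock is the sleepqueue chain lock.  If td_lock is a
 * scheduler lock, the thread is still going to sleep, and taking the
 * callout lock there would reverse softclock_thread()'s
 * callout lock -> scheduler lock order.
 */
static bool
should_stop_slpcallout(const struct thread_model *td, const struct lock *sc_lock)
{
	return ((td->td_flags & TDF_TIMEOUT) == 0 && td->td_sleeptimo != 0 &&
	    td->td_lock == sc_lock);
}
```

When the predicate is false because td_lock is a scheduler lock, the callout is left armed and may later fire spuriously; sleepq_timeout() already tolerates that, per the existing comment in the code.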
PR: 261198
Fixes: 74cf7cae4d22 ("softclock: Use dedicated ithreads for running callouts.")
Reported and tested by: thj
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34204
---
sys/kern/subr_sleepqueue.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/sys/kern/subr_sleepqueue.c b/sys/kern/subr_sleepqueue.c
index 36832ef96ba4..af5a001b46fb 100644
--- a/sys/kern/subr_sleepqueue.c
+++ b/sys/kern/subr_sleepqueue.c
@@ -833,7 +833,8 @@ sleepq_remove_thread(struct sleepqueue *sq, struct thread *td)
 	td->td_sleepqueue = LIST_FIRST(&sq->sq_free);
 	LIST_REMOVE(td->td_sleepqueue, sq_hash);
 
-	if ((td->td_flags & TDF_TIMEOUT) == 0 && td->td_sleeptimo != 0)
+	if ((td->td_flags & TDF_TIMEOUT) == 0 && td->td_sleeptimo != 0 &&
+	    td->td_lock == &sc->sc_lock) {
 		/*
 		 * We ignore the situation where timeout subsystem was
 		 * unable to stop our callout. The struct thread is
@@ -843,8 +844,16 @@ sleepq_remove_thread(struct sleepqueue *sq, struct thread *td)
 		 * sleepq_timeout() ensure that the thread does not
 		 * get spurious wakeups, even if the callout was reset
 		 * or thread reused.
+		 *
+		 * We also cannot safely stop the callout if a scheduler
+		 * lock is held since softclock_thread() forces a lock
+		 * order of callout lock -> scheduler lock. The thread
+		 * lock will be a scheduler lock only if the thread is
+		 * preparing to go to sleep, so this is hopefully a rare
+		 * scenario.
 		 */
 		callout_stop(&td->td_slpcallout);
+	}
 
 	td->td_wmesg = NULL;
 	td->td_wchan = NULL;