[Bug 261198] bhyve host panics with: spin lock 0xffffffff81eac800 (callout) helpanic: spin lock held too long

From: <bugzilla-noreply_at_freebsd.org>
Date: Mon, 14 Feb 2022 15:08:03 UTC

--- Comment #9 from commit-hook@FreeBSD.org ---
A commit in branch main references this bug:


commit 893be9d8ac161c4cc96e9f3f12f1260355dd123b
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2022-02-14 14:38:53 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2022-02-14 15:06:47 +0000

    sleepqueue: Address a lock order reversal

    After commit 74cf7cae4d22 ("softclock: Use dedicated ithreads for
    running callouts."), there is a lock order reversal between the per-CPU
    callout lock and the scheduler lock.  softclock_thread() acquires the
    callout lock and then the scheduler lock when preparing to switch
    off-CPU, and sleepq_remove_thread() stops the timed sleep callout while
    potentially holding a scheduler lock.  In the latter case, it's the
    thread itself
    that's locked, and if the thread is sleeping then its lock will be a
    sleepqueue lock, but if it's still in the process of going to sleep
    it'll be a scheduler lock.

    We could perhaps change softclock_thread() to try to acquire locks in
    the opposite order, but that'd require dropping and re-acquiring the
    callout lock, which seems expensive for an operation that will happen
    quite frequently.  We can instead perhaps avoid stopping the
    td_slpcallout callout if the thread is still going to sleep, which is
    what this patch does.  This will result in a spurious call to
    sleepq_timeout(), but some counters suggest that this is very rare.

    PR:             261198
    Fixes:          74cf7cae4d22 ("softclock: Use dedicated ithreads for
                    running callouts.")
    Reported and tested by: thj
    Reviewed by:    kib
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D34204
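
The reversal described in the commit message can be illustrated with a toy
order recorder. This is only a sketch: the names model_lock, model_order and
model_record_order are hypothetical and are not kernel interfaces. It records
which lock was held when another was acquired, and reports when the opposite
order has already been seen. It only detects a direct reversal between two
locks, which is the case described above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two lock classes named in the commit. */
enum model_lock { MODEL_CALLOUT_LOCK, MODEL_SCHED_LOCK, MODEL_NLOCKS };

/* order_seen[a][b] is true once some path acquired b while holding a. */
struct model_order {
	bool order_seen[MODEL_NLOCKS][MODEL_NLOCKS];
};

/*
 * Record that 'acquired' was taken while 'held' was held.
 * Returns true if the opposite order was recorded earlier, i.e. the two
 * code paths together form a lock order reversal.
 */
bool
model_record_order(struct model_order *o, enum model_lock held,
    enum model_lock acquired)
{
	o->order_seen[held][acquired] = true;
	return (o->order_seen[acquired][held]);
}
```

Recording callout -> scheduler (the softclock_thread() path) and then
scheduler -> callout (the sleepq_remove_thread() path) makes the second call
report a reversal.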

 sys/kern/subr_sleepqueue.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)
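
The approach described in the commit message, skipping the callout stop while
the thread is still going to sleep and tolerating a spurious timeout, can be
modelled in userspace as follows. This is a sketch under stated assumptions,
not the code from the patch: struct model_thread, its fields, and both
functions are hypothetical names invented for illustration.

```c
#include <stdbool.h>

/* Hypothetical: which kind of lock currently serves as the thread lock. */
enum model_lock_kind { MODEL_LOCK_SLEEPQUEUE, MODEL_LOCK_SCHEDULER };

struct model_thread {
	enum model_lock_kind lock_kind;	/* sleepqueue lock once asleep */
	bool callout_armed;		/* timed-sleep callout is pending */
	bool on_sleepq;			/* thread is waiting on a sleepqueue */
	unsigned spurious_timeouts;	/* timeouts that found nothing to do */
};

/*
 * Model of removing a thread from its sleepqueue.  The callout is stopped
 * only when the thread lock is a sleepqueue lock; with a scheduler lock
 * held, stopping it would need the callout lock in the reversed order, so
 * the callout is left armed.  Returns true if the callout was stopped.
 */
bool
model_remove_thread(struct model_thread *td)
{
	td->on_sleepq = false;
	if (td->callout_armed && td->lock_kind == MODEL_LOCK_SLEEPQUEUE) {
		td->callout_armed = false;
		return (true);
	}
	return (false);
}

/* Model of the timeout handler: harmless if the thread is already gone. */
void
model_timeout(struct model_thread *td)
{
	if (!td->callout_armed)
		return;
	td->callout_armed = false;
	if (!td->on_sleepq) {
		td->spurious_timeouts++;	/* the rare spurious call */
		return;
	}
	td->on_sleepq = false;			/* normal timed wakeup */
}
```

In this model a thread removed while protected by a scheduler lock keeps its
callout armed, and the later model_timeout() call is counted as spurious
instead of waking anything.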
