git: b411372b7d17 - main - mpi3mr: Don't hold fwevt_lock over call to taskqueue_drain

From: Warner Losh <imp_at_FreeBSD.org>
Date: Wed, 29 Nov 2023 01:55:30 UTC
The branch main has been updated by imp:

URL: https://cgit.FreeBSD.org/src/commit/?id=b411372b7d17ae7e5d6c944732d41b979bde2ac4

commit b411372b7d17ae7e5d6c944732d41b979bde2ac4
Author:     Warner Losh <imp@FreeBSD.org>
AuthorDate: 2023-11-29 01:48:48 +0000
Commit:     Warner Losh <imp@FreeBSD.org>
CommitDate: 2023-11-29 01:48:48 +0000

    mpi3mr: Don't hold fwevt_lock over call to taskqueue_drain
    
    Holding fwevt_lock when we call taskqueue_drain can lead to deadlock
    because it's draining a queue needs fwevt_lock to do work, so that other
    thread will try to take out the lock and block, making the thread never
    finish and taskqueue_drain never complete. There's a witness
    warning/error for this which was exposed when the lock was converted to
    a MTX_DEF lock from a MTX_SPIN prior to committing to the FreeBSD tree.
    
    The lock appears to be to protect against additional items being added
    to the event list while we're doing a reset. Since the taskqueue is
    blocked, items can get added to the list, but won't be processed during
    the reset, but there is still a (likely small) race between the
    taskqueue_drain and the taskqueue_block calls where an interrupt could
    fire on another CPU, resulting in a task being enqueued and started
    before the block can take effect. The only way to fix that race is to
    turn off interrupt processing during a reset. So we replace a deadlock
    with a smaller race.
    
    Sponsored by:           Netflix
    Reviewed by:            sumit.saxena_broadcom.com, mav, jhb
    Differential Revision:  https://reviews.freebsd.org/D42537
---
 sys/dev/mpi3mr/mpi3mr.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/sys/dev/mpi3mr/mpi3mr.c b/sys/dev/mpi3mr/mpi3mr.c
index 776bfbdaee23..478b0944defa 100644
--- a/sys/dev/mpi3mr/mpi3mr.c
+++ b/sys/dev/mpi3mr/mpi3mr.c
@@ -2784,7 +2784,6 @@ int mpi3mr_initialize_ioc(struct mpi3mr_softc *sc, U8 init_type)
 		mtx_init(&sc->sense_buf_q_lock, "Sense buffer Queue lock", NULL, MTX_SPIN);
 		mtx_init(&sc->chain_buf_lock, "Chain buffer lock", NULL, MTX_SPIN);
 		mtx_init(&sc->cmd_pool_lock, "Command pool lock", NULL, MTX_DEF);
-//		mtx_init(&sc->fwevt_lock, "Firmware Event lock", NULL, MTX_SPIN);
 		mtx_init(&sc->fwevt_lock, "Firmware Event lock", NULL, MTX_DEF);
 		mtx_init(&sc->target_lock, "Target lock", NULL, MTX_SPIN);
 		mtx_init(&sc->reset_mutex, "Reset lock", NULL, MTX_DEF);
@@ -5825,11 +5824,14 @@ static int mpi3mr_issue_reset(struct mpi3mr_softc *sc, U16 reset_type,
 
 inline void mpi3mr_cleanup_event_taskq(struct mpi3mr_softc *sc)
 {
-	mtx_lock(&sc->fwevt_lock);
-	taskqueue_drain(sc->cam_sc->ev_tq, &sc->cam_sc->ev_task);
+	/*
+	 * Block the taskqueue before draining. This means any new tasks won't
+	 * be queued to a worker thread. But it doesn't stop the current workers
+	 * that are running. taskqueue_drain waits for those correctly in the
+	 * case of thread backed taskqueues.
+	 */
 	taskqueue_block(sc->cam_sc->ev_tq);
-	mtx_unlock(&sc->fwevt_lock);
-	return;
+	taskqueue_drain(sc->cam_sc->ev_tq, &sc->cam_sc->ev_task);
 }
 
 /**