git: e0edb7f3aad5 - stable/12 - ena: do not call reset if device is unresponsive

From: Marcin Wojtas <mw_at_FreeBSD.org>
Date: Thu, 24 Feb 2022 13:04:28 UTC

The branch stable/12 has been updated by mw:

URL: https://cgit.FreeBSD.org/src/commit/?id=e0edb7f3aad5dc4443e921e0bf4e7fbbba3e8f65

commit e0edb7f3aad5dc4443e921e0bf4e7fbbba3e8f65
Author:     Dawid Gorecki <dgr@semihalf.com>
AuthorDate: 2022-01-03 13:50:29 +0000
Commit:     Marcin Wojtas <mw@FreeBSD.org>
CommitDate: 2022-02-24 13:04:05 +0000

    ena: do not call reset if device is unresponsive
    
    If the device becomes unresponsive, the driver will not be able to
    finish the reset process correctly. Timeout during version validation
    indicates that the device is currently not responding. In that case
    do not perform the reset and instead reschedule timer service. Because
    of that the driver will continue trying to reset the device until it
    succeeds or is detached.
    
    Submitted by: Dawid Gorecki <dgr@semihalf.com>
    Obtained from: Semihalf
    MFC after: 2 weeks
    Sponsored by: Amazon, Inc.
    
    (cherry picked from commit d10ec3ad7739a6f621d398d034632f68f647d72f)
---
 sys/dev/ena/ena.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/sys/dev/ena/ena.c b/sys/dev/ena/ena.c
index c069d04874ba..fe3bcac6c6ab 100644
--- a/sys/dev/ena/ena.c
+++ b/sys/dev/ena/ena.c
@@ -3282,6 +3282,18 @@ ena_timer_service(void *data)
 		ena_update_host_info(host_info, adapter->ifp);
 
 	if (unlikely(ENA_FLAG_ISSET(ENA_FLAG_TRIGGER_RESET, adapter))) {
+		/*
+		 * Timeout when validating version indicates that the device
+		 * became unresponsive. If that happens skip the reset and
+		 * reschedule timer service, so the reset can be retried later.
+		 */
+		if (ena_com_validate_version(adapter->ena_dev) ==
+		    ENA_COM_TIMER_EXPIRED) {
+			ena_log(adapter->pdev, WARN,
+			    "FW unresponsive, skipping reset\n");
+			ENA_TIMER_RESET(adapter);
+			return;
+		}
 		ena_log(adapter->pdev, WARN, "Trigger reset is on\n");
 		taskqueue_enqueue(adapter->reset_tq, &adapter->reset_task);
 		return;
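
The control flow this hunk adds can be tried outside the kernel. What follows is a minimal userspace sketch, not driver code: every `sim_*` name, the `sim_adapter` fields, and the status values are hypothetical stand-ins invented for illustration. It assumes that the timer service is re-armed on the normal (no reset pending) path and that a queued reset clears the trigger flag; neither detail is shown in the diff above.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical status codes for this sketch; not the ena_com values. */
enum sim_status { SIM_OK = 0, SIM_TIMER_EXPIRED = -1 };

/* Hypothetical adapter state; the real struct ena_adapter is far larger. */
struct sim_adapter {
	int  unresponsive_ticks; /* ticks during which the device stays silent */
	bool trigger_reset;      /* models ENA_FLAG_TRIGGER_RESET */
	int  timer_rearmed;      /* times the timer service was rescheduled */
	int  resets_enqueued;    /* times the reset task was queued */
};

/* Stand-in for ena_com_validate_version(): times out while the device is silent. */
static enum sim_status
sim_validate_version(struct sim_adapter *a)
{
	if (a->unresponsive_ticks > 0) {
		a->unresponsive_ticks--;
		return (SIM_TIMER_EXPIRED);
	}
	return (SIM_OK);
}

/* One run of the timer service, mirroring the branch added by the commit. */
static void
sim_timer_service(struct sim_adapter *a)
{
	if (a->trigger_reset) {
		if (sim_validate_version(a) == SIM_TIMER_EXPIRED) {
			/* Device is silent: skip the reset and retry on the next tick. */
			printf("FW unresponsive, skipping reset\n");
			a->timer_rearmed++;
			return;
		}
		printf("Trigger reset is on\n");
		a->resets_enqueued++;
		/* Assumption: the reset path clears the flag once it has run. */
		a->trigger_reset = false;
		return;
	}
	/* Assumption: the normal path re-arms the timer for the next tick. */
	a->timer_rearmed++;
}

/*
 * Drive the timer service for at most max_ticks runs.
 * Returns the tick on which the reset was queued, or -1 if it never was.
 */
static int
sim_run_until_reset(struct sim_adapter *a, int max_ticks)
{
	for (int tick = 1; tick <= max_ticks; tick++) {
		sim_timer_service(a);
		if (a->resets_enqueued > 0)
			return (tick);
	}
	return (-1);
}
```

Starting from an adapter with `unresponsive_ticks = 3` and `trigger_reset = true`, `sim_run_until_reset()` skips the reset on three consecutive ticks and queues it on the fourth, which matches the commit message: the driver keeps retrying until the device answers again or the driver is detached.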