git: 8875a5872c1a - stable/13 - ena: do not call reset if device is unresponsive

From: Marcin Wojtas <mw_at_FreeBSD.org>
Date: Thu, 24 Feb 2022 12:55:48 UTC
The branch stable/13 has been updated by mw:

URL: https://cgit.FreeBSD.org/src/commit/?id=8875a5872c1a7933bd8e6a248adc05053547ad54

commit 8875a5872c1a7933bd8e6a248adc05053547ad54
Author:     Dawid Gorecki <dgr@semihalf.com>
AuthorDate: 2022-01-03 13:50:29 +0000
Commit:     Marcin Wojtas <mw@FreeBSD.org>
CommitDate: 2022-02-24 12:53:44 +0000

    ena: do not call reset if device is unresponsive
    
    If the device becomes unresponsive, the driver cannot finish the
    reset process correctly. A timeout during version validation
    indicates that the device is currently not responding. In that case,
    do not perform the reset; reschedule the timer service instead. As a
    result, the driver keeps retrying the reset until it succeeds or the
    device is detached.
    
    Submitted by: Dawid Gorecki <dgr@semihalf.com>
    Obtained from: Semihalf
    MFC after: 2 weeks
    Sponsored by: Amazon, Inc.
    
    (cherry picked from commit d10ec3ad7739a6f621d398d034632f68f647d72f)
---
 sys/dev/ena/ena.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/sys/dev/ena/ena.c b/sys/dev/ena/ena.c
index f4abe61f08ae..1b26a91c5d9e 100644
--- a/sys/dev/ena/ena.c
+++ b/sys/dev/ena/ena.c
@@ -3278,6 +3278,18 @@ ena_timer_service(void *data)
 		ena_update_host_info(host_info, adapter->ifp);
 
 	if (unlikely(ENA_FLAG_ISSET(ENA_FLAG_TRIGGER_RESET, adapter))) {
+		/*
+		 * Timeout when validating version indicates that the device
+		 * became unresponsive. If that happens skip the reset and
+		 * reschedule timer service, so the reset can be retried later.
+		 */
+		if (ena_com_validate_version(adapter->ena_dev) ==
+		    ENA_COM_TIMER_EXPIRED) {
+			ena_log(adapter->pdev, WARN,
+			    "FW unresponsive, skipping reset\n");
+			ENA_TIMER_RESET(adapter);
+			return;
+		}
 		ena_log(adapter->pdev, WARN, "Trigger reset is on\n");
 		taskqueue_enqueue(adapter->reset_tq, &adapter->reset_task);
 		return;
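
The control flow added by the hunk above can be modeled outside the kernel
as a rough sketch. Everything in it is a hypothetical stand-in and not part
of the ENA driver: struct fake_adapter, fake_validate_version(),
fake_timer_service() and ticks_until_reset() only imitate the roles of
ena_com_validate_version(), ENA_TIMER_RESET() and taskqueue_enqueue(). The
model also assumes that the device recovers after a fixed number of timer
ticks and that a reset is already pending, neither of which the commit
states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the probe result; not the ENA status codes. */
enum probe_status { PROBE_OK, PROBE_TIMER_EXPIRED };

/* What one timer tick decided to do in this model. */
enum timer_action { ACTION_RESCHEDULE_TIMER, ACTION_ENQUEUE_RESET, ACTION_IDLE };

/* Hypothetical adapter state; the real driver keeps far more than this. */
struct fake_adapter {
	bool trigger_reset;     /* models ENA_FLAG_TRIGGER_RESET */
	int unresponsive_ticks; /* assumed: probes that will still time out */
	int timer_rearms;       /* times the timer was re-armed instead of resetting */
	int resets_enqueued;    /* times the reset task was queued */
};

/* Stand-in for the version check: times out while the device is unresponsive. */
static enum probe_status
fake_validate_version(struct fake_adapter *a)
{
	if (a->unresponsive_ticks > 0) {
		a->unresponsive_ticks--;
		return PROBE_TIMER_EXPIRED;
	}
	return PROBE_OK;
}

/* One timer tick: skip the reset and re-arm the timer if the probe times out. */
static enum timer_action
fake_timer_service(struct fake_adapter *a)
{
	if (!a->trigger_reset)
		return ACTION_IDLE;

	if (fake_validate_version(a) == PROBE_TIMER_EXPIRED) {
		a->timer_rearms++;      /* reset stays pending, retried next tick */
		return ACTION_RESCHEDULE_TIMER;
	}

	a->resets_enqueued++;           /* device answered, hand off to reset task */
	a->trigger_reset = false;       /* model only: treat the reset as consumed */
	return ACTION_ENQUEUE_RESET;
}

/* Run ticks until a reset is queued; return the tick count, or -1 if never. */
static int
ticks_until_reset(struct fake_adapter *a, int max_ticks)
{
	assert(max_ticks > 0);
	for (int tick = 1; tick <= max_ticks; tick++) {
		if (fake_timer_service(a) == ACTION_ENQUEUE_RESET)
			return tick;
	}
	return -1;
}
```

In this model an adapter that starts with trigger_reset set and
unresponsive_ticks equal to 3 re-arms the timer three times and queues the
reset on the fourth tick. If the device never answers within the allowed
ticks, no reset is queued, which corresponds to the driver retrying until
it is detached.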