Re: ena(4) tx timeout messages in dmesg

From: Pete Wright <pete_at_nomadlogic.org>
Date: Mon, 12 May 2025 18:25:43 UTC

On 5/12/25 11:17, Colin Percival wrote:
> [+ akiyano, maintainer of the ena(4) driver]
> 
> On 5/12/25 11:04, Pete Wright wrote:
>> hey there - i have an ec2 instance that i'm using as an NFS server and 
>> have noticed the following messages in my dmesg buffer:
>>
>> ena0: Found a Tx that wasn't completed on time, qid 2, index 593. 10 msecs have passed since last cleanup. Missing Tx timeout value 5000 msecs.
>> ena0: Found a Tx that wasn't completed on time, qid 2, index 220. 1 msecs have passed since last cleanup. Missing Tx timeout value 5000 msecs.
>> ena0: Found a Tx that wasn't completed on time, qid 3, index 240. 1 msecs have passed since last cleanup. Missing Tx timeout value 5000 msecs.
>> ena0: Found a Tx that wasn't completed on time, qid 3, index 974. 1 msecs have passed since last cleanup. Missing Tx timeout value 5000 msecs.
>> ena0: Found a Tx that wasn't completed on time, qid 2, index 730. 1 msecs have passed since last cleanup. Missing Tx timeout value 5000 msecs.
>> ena0: Found a Tx that wasn't completed on time, qid 2, index 864. 10 msecs have passed since last cleanup. Missing Tx timeout value 5000 msecs.
>> ena0: Found a Tx that wasn't completed on time, qid 3, index 998. 1 msecs have passed since last cleanup. Missing Tx timeout value 5000 msecs.
>>
>> the system is not overly loaded, but does have a steady 25% CPU usage 
>> and averages around 2MB/sec network throughput (the system serves a 
>> python virtual-environment to a cluster of data processing systems).
>>
>> The man page states: "Packet was pushed to the NIC but not sent within 
>> given time limit.  It may be caused by hang of the IO queue."
>>
>> I was curious whether anyone had any idea if these messages indicate a 
>> poorly tuned system, or if they are just informational.  Looking at 
>> basics like mbufs and other core metrics, the system looks OK from 
>> that perspective.
> 
> I've heard that this can be caused by a thread being starved for CPU, 
> possibly
> due to FreeBSD kernel scheduler issues, but that was on a far more heavily
> loaded system.  What instance type are you running on?
> 
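
(side note: the "basics" i mentioned above were the stock FreeBSD counters 
rather than anything ena-specific.  a rough sketch of the commands, in case 
anyone wants to compare notes -- the filters are just my guesses at what's 
useful here:)

# netstat -m                     # mbuf/cluster usage; look for "denied" requests
# vmstat -z | grep -i mbuf       # UMA zone stats for the mbuf zones
# sysctl dev.ena.0 | grep -i tx  # whatever per-queue tx counters the driver exposes
# top -SHz                       # look for starved kernel threads, per Colin's theory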

oh of course, forgot to provide useful info:

# uname -ar
FreeBSD airflow-nfs.q0.ringdna.net 14.2-RELEASE-p1 FreeBSD 
14.2-RELEASE-p1 GENERIC amd64

Instance type:
t3a.xlarge

I also verified I have plenty of burstable credit available since this 
is a t-class instance (the current balance is steady at 2,300 credits).
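
for anyone wanting to check the same thing, CloudWatch exposes the balance 
as the CPUCreditBalance metric; something like the following should pull it 
(the instance id and time window below are placeholders):

# aws cloudwatch get-metric-statistics --namespace AWS/EC2 \
    --metric-name CPUCreditBalance \
    --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
    --statistics Average --period 300 \
    --start-time 2025-05-12T00:00:00Z --end-time 2025-05-12T18:00:00Z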

the exported filesystem resides on a dedicated ZFS pool, and the dataset 
itself fits fully in memory, so there is basically zero disk I/O 
happening while serving 99% reads from nfsd.  the virtual-env is ~500MB.
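
(the "basically zero disk I/O" claim is from watching the pool and the ARC 
counters, roughly as follows -- "nfspool" is a stand-in for the real pool 
name:)

# zpool iostat nfspool 5  # per-interval read/write ops; reads should sit near zero
# sysctl kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses  # hits should dwarf misses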

thanks!
-pete



-- 
Pete Wright
pete@nomadlogic.org