[Bug 234838] ena drop-outs on 12.0-RELEASE

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Tue Oct 22 18:41:47 UTC 2019


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=234838

Ryan Langseth <langseth at iteris.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |langseth at iteris.com

--- Comment #6 from Ryan Langseth <langseth at iteris.com> ---
It seems like there is still an issue with this. We are running FreeBSD
12.0-RELEASE-p10 on a c5.2xl instance, and the system has reset the network
device twice in the last 24 hours.

The current traffic to it is a zfs recv over ssh running at ~40MiB/s. The first
time it dropped out, only the ena device was affected. The second time I also
got nvme 'Missing interrupt' messages. The system has 6 gp2 volumes for the
zpool.

`grep kern.crit /var/log/messages`
Oct 21 12:26:52 <kern.crit> apache-00 kernel: Trying to mount root from ufs:/dev/gpt/rootfs [rw]...
Oct 21 12:26:52 <kern.crit> apache-00 kernel: ena0: device is going UP
Oct 21 12:26:52 <kern.crit> apache-00 kernel: ena0: device is going DOWN
Oct 21 12:26:52 <kern.crit> apache-00 kernel: ena0: device is going UP
Oct 21 12:26:52 <kern.crit> apache-00 kernel: intsmb0: <Intel PIIX4 SMBUS Interface> port 0xb100-0xb10f at device 1.3 on pci0
Oct 21 12:26:52 <kern.crit> apache-00 kernel: intsmb0: intr IRQ 9 enabled revision 255
Oct 21 12:26:52 <kern.crit> apache-00 kernel: smbus0: <System Management Bus> on intsmb0
Oct 21 12:26:53 <kern.crit> apache-00 kernel: Security policy loaded: MAC/ntpd (mac_ntpd)

Oct 22 06:33:55 <kern.crit> apache-00 kernel: ena0: The number of lost tx completion is above the threshold (129 > 128). Reset the device
Oct 22 06:33:55 <kern.crit> apache-00 kernel: ena0: Trigger reset is on
Oct 22 06:33:55 <kern.crit> apache-00 kernel: ena0: device is going DOWN
Oct 22 06:34:02 <kern.crit> apache-00 kernel: ena0: free uncompleted tx mbuf qid 0 idx 0x1f2
Oct 22 06:34:03 <kern.crit> apache-00 kernel: ena0: ena0: device is going UP
Oct 22 06:34:03 <kern.crit> apache-00 kernel: link is UP

Oct 22 13:18:10 <kern.crit> apache-00 kernel: ena0: The number of lost tx completion is above the threshold (129 > 128). Reset the device
Oct 22 13:18:10 <kern.crit> apache-00 kernel: ena0: Trigger reset is on
Oct 22 13:18:10 <kern.crit> apache-00 kernel: ena0: device is going DOWN
Oct 22 13:18:16 <kern.crit> apache-00 kernel: ena0: free uncompleted tx mbuf qid 4 idx 0x3a6
Oct 22 13:18:16 <kern.crit> apache-00 kernel: 
Oct 22 13:18:16 <kern.crit> apache-00 kernel: ena0: device is going UP
Oct 22 13:18:16 <kern.crit> apache-00 kernel: ena0: link is UP
Oct 22 13:18:47 <kern.crit> apache-00 kernel: nvme0: Missing interrupt
Oct 22 13:18:51 <kern.crit> apache-00 kernel: nvme5: Missing interrupt
Oct 22 13:18:51 <kern.crit> apache-00 kernel: nvme2: Missing interrupt
Oct 22 13:19:17 <kern.crit> apache-00 kernel: nvme0: Missing interrupt
Oct 22 13:20:47 <kern.crit> apache-00 kernel: nvme2: Missing interrupt
Oct 22 13:20:47 <kern.crit> apache-00 kernel: nvme5: Missing interrupt
Oct 22 13:20:47 <kern.crit> apache-00 kernel: nvme0: Missing interrupt
Oct 22 13:20:47 <kern.crit> apache-00 kernel: nvme5: Missing interrupt
Oct 22 13:20:47 <kern.crit> apache-00 kernel: nvme6: Missing interrupt
Oct 22 13:20:47 <kern.crit> apache-00 kernel: nvme2: Missing interrupt
Oct 22 13:20:47 <kern.crit> apache-00 kernel: nvme0: Missing interrupt
Oct 22 13:20:47 <kern.crit> apache-00 kernel: nvme5: Missing interrupt
Oct 22 13:20:47 <kern.crit> apache-00 kernel: nvme0: Missing interrupt
Oct 22 13:22:17 <kern.crit> apache-00 kernel: nvme4: Missing interrupt
Oct 22 13:22:17 <kern.crit> apache-00 kernel: nvme2: Missing interrupt
Oct 22 13:22:17 <kern.crit> apache-00 kernel: nvme5: Missing interrupt
Oct 22 13:22:17 <kern.crit> apache-00 kernel: nvme0: Missing interrupt
Oct 22 13:22:17 <kern.crit> apache-00 kernel: nvme5: nvme2: nvme4: Missing interrupt
Oct 22 13:22:17 <kern.crit> apache-00 kernel: Missing interrupt
Oct 22 13:22:17 <kern.crit> apache-00 syslogd: last message repeated 1 times
Oct 22 13:22:17 <kern.crit> apache-00 kernel: nvme0: Missing interrupt
Oct 22 13:22:17 <kern.crit> apache-00 kernel: nvme2: Missing interrupt
Oct 22 13:22:17 <kern.crit> apache-00 kernel: nvme0: 
Oct 22 13:22:17 <kern.crit> apache-00 kernel: 
Oct 22 13:22:17 <kern.crit> apache-00 kernel: Missing interrupt
Oct 22 13:22:21 <kern.crit> apache-00 kernel: nvme2: nvme6: Missing interrupt
Oct 22 13:22:21 <kern.crit> apache-00 kernel: nvme5: Missing interrupt
Oct 22 13:22:21 <kern.crit> apache-00 kernel: Missing interrupt
Oct 22 13:22:21 <kern.crit> apache-00 kernel: nvme4: Missing interrupt
Oct 22 13:22:51 <kern.crit> apache-00 kernel: nvme6: nvme4: Missing interrupt
Oct 22 13:22:51 <kern.crit> apache-00 kernel: Missing interrupt
Oct 22 13:22:51 <kern.crit> apache-00 kernel: nvme5: Missing interrupt
Oct 22 13:22:51 <kern.crit> apache-00 kernel: nvme2: Missing interrupt
Oct 22 13:23:16 <kern.crit> apache-00 kernel: nvme0: Missing interrupt
Oct 22 13:23:21 <kern.crit> apache-00 kernel: nvme2: Missing interrupt
Oct 22 13:23:21 <kern.crit> apache-00 kernel: nvme6: Missing interrupt
Oct 22 13:23:26 <kern.crit> apache-00 kernel: nvme4: Missing interrupt
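For anyone trying to compare against their own logs, here is a rough sketch
(not part of the original report) that tallies ena resets and nvme device
prefixes from kern.crit lines. The function name `summarize` and the regular
expressions are illustrative assumptions, and the line format is assumed to
match the excerpt above. Because concurrent kernel messages get interleaved
(e.g. "nvme2: nvme6: Missing interrupt"), it counts each `nvmeN:` prefix rather
than each "Missing interrupt" string.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpt above:
#   "Oct 22 13:18:47 <kern.crit> apache-00 kernel: nvme0: Missing interrupt"
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+<kern\.crit>\s+\S+\s+kernel:\s*(.*)$"
)
ENA_RESET_RE = re.compile(r"(ena\d+): Trigger reset is on")
NVME_PREFIX_RE = re.compile(r"(nvme\d+):")


def summarize(lines):
    """Return (ena_resets, nvme_counts) for an iterable of log lines.

    ena_resets  -- list of (timestamp, device) for each "Trigger reset is on"
    nvme_counts -- Counter of how often each nvmeN: prefix appears
    """
    ena_resets = []
    nvme_counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match is None:
            continue  # wrapped continuation lines, syslogd lines, blanks
        timestamp, message = match.groups()
        reset = ENA_RESET_RE.search(message)
        if reset:
            ena_resets.append((timestamp, reset.group(1)))
        # Count every nvmeN: prefix, since interleaved output can put
        # several prefixes on one line and the message text on the next.
        for device in NVME_PREFIX_RE.findall(message):
            nvme_counts[device] += 1
    return ena_resets, nvme_counts
```

Something like `summarize(open("/var/log/messages"))` would then give the reset
timestamps and a per-device count to correlate the nvme messages with the ena
resets.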

I will add that this instance was originally a FreeBSD 11.x system that was
freebsd-update'd to 12. As an 11.x system it was panicking on the transfer every
3-4 hours. I am bringing up a fresh 12.x system to do additional testing.

-- 
You are receiving this mail because:
You are the assignee for the bug.

