[Bug 225791] ena driver causing kernel panics on AWS EC2

bugzilla-noreply at freebsd.org
Tue May 22 22:25:23 UTC 2018


Terje Elde <terje at elde.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |terje at elde.net

--- Comment #1 from Terje Elde <terje at elde.net> ---
We're also affected by this, running a c5.large that handles about 13,000
connections through haproxy, then varnish, and on to other systems.  Activity
was about 4,000 requests per minute leading up to the crash, which doesn't seem
all that high.  It's possible that it spiked shortly before the crash, though,
without that showing up in the logs.

This is:
FreeBSD [host snipped] 11.1-RELEASE-p8 FreeBSD 11.1-RELEASE-p8 #0: Tue Mar 13
17:07:05 UTC 2018    
root at amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64

It's a lightly modified/configured version of one of the usual FreeBSD AMIs; I
don't recall the exact AMI ID, sorry.  The kernel etc. is stock; we've just
added software etc. for our own AMI.

We have two virtually identical machines exposed under the same hostname,
receiving a near-identical load, and so far we have only noticed this on
one of them.  Could be coincidental, but I figured it worthwhile to mention.

It strikes me as noteworthy that the data rate was only about 700kBps at the
last data point I have before the crash.  Unfortunately I don't know anything
about the packet rate, and again it's possible that there was a peak
leading up to the crash that never made it into the logs.

If anyone is interested in any other data from this, please do let me know.
Also, this is part of a redundant setup, which leaves some extra room for
moving things around if anyone wants anything tested or tried on it.

>> Crash itself:

Limiting open port RST response from 457 to 200 packets/sec
Limiting open port RST response from 487 to 200 packets/sec
Limiting open port RST response from 541 to 200 packets/sec
Limiting open port RST response from 517 to 200 packets/sec
Limiting open port RST response from 586 to 200 packets/sec
Limiting open port RST response from 237 to 200 packets/sec
ena0: Found a Tx that wasn't completed on time, qid 1, index 324.
pid 3639 (varnishd), uid 429: exited on signal 6
Limiting open port RST response from 259 to 200 packets/sec
Limiting open port RST response from 380 to 200 packets/sec
ena0: Found a Tx that wasn't completed on time, qid 1, index 181.

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x1c
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff82173f8c
stack pointer           = 0x28:0xfffffe0110f43180
frame pointer           = 0x28:0xfffffe0110f43260
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (irq261: ena0)
trap number             = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff80aadac7 at kdb_backtrace+0x67
#1 0xffffffff80a6bba6 at vpanic+0x186
#2 0xffffffff80a6ba13 at panic+0x43
#3 0xffffffff80ee3092 at trap_fatal+0x322
#4 0xffffffff80ee30eb at trap_pfault+0x4b
#5 0xffffffff80ee290a at trap+0x2ca
#6 0xffffffff80ec3d40 at calltrap+0x8
#7 0xffffffff80a321ec at intr_event_execute_handlers+0xec
#8 0xffffffff80a324d6 at ithread_loop+0xd6
#9 0xffffffff80a2f845 at fork_exit+0x85
#10 0xffffffff80ec4a0e at fork_trampoline+0xe
Uptime: 8d22h59m55s
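
Since the packet rate is unknown, the "Limiting open port RST response"
lines in the console output above are about the only rate hint available.
A minimal sketch in Python for pulling those numbers out of a saved console
log; the function name rst_rates and the sample text are made up for
illustration, and it assumes the message format is exactly as shown above.

```python
import re

# Matches the kernel's rate-limit console message, e.g.
# "Limiting open port RST response from 457 to 200 packets/sec".
# Group 1 is the attempted rate, group 2 the configured limit.
RST_LIMIT = re.compile(
    r"Limiting open port RST response from (\d+) to (\d+) packets/sec"
)


def rst_rates(log_text):
    """Return the attempted RST rates (packets/sec) found in log_text."""
    return [int(m.group(1)) for m in RST_LIMIT.finditer(log_text)]


# Hypothetical excerpt; in practice, read the saved console log instead.
sample = """\
Limiting open port RST response from 457 to 200 packets/sec
ena0: Found a Tx that wasn't completed on time, qid 1, index 324.
Limiting open port RST response from 586 to 200 packets/sec
"""

rates = rst_rates(sample)
print(rates, max(rates))
```

These numbers only count RSTs the kernel wanted to send, so they are at best
a rough lower bound on inbound packet rate, not a measurement of it.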

>> boot log:

Copyright (c) 1992-2017 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.1-RELEASE-p8 #0: Tue Mar 13 17:07:05 UTC 2018
    root at amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64
FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on LLVM
VT(vga): text 80x25
CPU: HammerEM64T (3000.05-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x50653  Family=0x6  Model=0x55  Stepping=3
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x121<LAHF,ABM,Prefetch>
  Structured Extended
  Structured Extended Features2=0x8<PKU>
  TSC: P-state invariant, performance statistics
Hypervisor: Origin = "KVMKVMKVM"
real memory  = 5114953728 (4878 MB)
avail memory = 3844890624 (3666 MB)
Event timer "LAPIC" quality 600
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 1 core(s) x 2 hardware threads
random: unblocking device.
ioapic0 <Version 1.1> irqs 0-23 on motherboard
SMP: AP CPU #1 Launched!
random: entropy device external interface
kbd1 at kbdmux0
netmap: loaded module
module_register_init: MOD_LOAD (vesa, 0xffffffff80f5eb40, 0) error 19
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
vtvga0: <VT VGA driver> on motherboard
cryptosoft0: <software crypto> on motherboard
acpi0: <AMAZON AMZNRSDT> on motherboard
acpi0: Power Button (fixed)
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
atrtc0: <AT realtime clock> port 0x70-0x71,0x72-0x77 irq 8 on acpi0
Event timer "RTC" frequency 32768 Hz quality 0
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <24-bit timer at 3.579545MHz> port 0xb008-0xb00b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
isab0: <PCI-ISA bridge> at device 1.0 on pci0
isa0: <ISA bus> on isab0
pci0: <old, non-VGA display device> at device 1.3 (no driver attached)
vgapci0: <VGA-compatible display> mem 0xfe400000-0xfe7fffff at device 3.0 on
vgapci0: Boot video device
nvme0: <Generic NVMe Device> mem 0xfebf0000-0xfebf3fff irq 11 at device 4.0 on
ena0: <ENA adapter> mem 0xfebf4000-0xfebf7fff at device 5.0 on pci0
ena0: Elastic Network Adapter (ENA)ena v0.7.0
ena0: initalize 2 io queues
ena0: Ethernet address: 02:2b:3a:f4:70:8c
ena0: Allocated msix_entries, vectors (cnt: 3)
nvme1: <Generic NVMe Device> mem 0xfebf8000-0xfebfbfff irq 11 at device 31.0 on
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
uart0: <Non-standard ns8250 class UART with FIFOs> port 0x3f8-0x3ff irq 4 flags
0x10 on acpi0
uart0: console (115200,n,8,1)
orm0: <ISA Option ROM> at iomem 0xef000-0xeffff on isa0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
attimer0: <AT timer> at port 0x40 on isa0
Timecounter "i8254" frequency 1193182 Hz quality 0
attimer0: Can't map interrupt.
ppc0: cannot reserve I/O port range
ena0: link is UP
ena0: link state changed to UP
Timecounters tick every 1.000 msec
usb_needs_explore_all: no devclass
nvme cam probe device init
nvme0: temperature threshold not supported
nvd0: <Amazon Elastic Block Store> NVMe namespace
nvd0: 20480MB (41943040 512 byte sectors)
nvme1: temperature threshold not supported
nvd1: <Amazon Elastic Block Store> NVMe namespace
GEOM: nvd1: corrupt or invalid GPT detected.
nvd1: 20480MB (41943040 512 byte sectors)
GEOM: nvd1: GPT rejected -- may not be recoverable.
Trying to mount root from ufs:/dev/gpt/rootfs [rw]...

