(bhyve) Debian vm crashing with kernel panic

Kári Hreinsson karihre at gmail.com
Thu Sep 7 04:30:29 UTC 2017


Dear all,

I have been experiencing random Linux kernel panics on a Debian
virtual machine running under bhyve on FreeBSD 11.1, and I believe they
may be related to the virtualization environment. I am not an advanced
FreeBSD user by any means, which is why I am turning to this mailing
list for possible answers, realizing that I could be making some
simple error. I have two similar Debian VMs (same version and kernel)
running on the FreeBSD host. One is lightly loaded and runs without
any issues; the other is more heavily loaded and experiences kernel
panics a few days after booting.

CPU: Intel(R) Xeon(R) CPU E3-1275 v6
Host system: 11.1-RELEASE-p1
VM: Debian 9 (Stretch), kernel 4.9.0-3-amd64 #1 SMP Debian 4.9.30-2+deb9u3

On the FreeBSD side I find nothing in any of the logs under
/var/log indicating a problem (perhaps I am not looking in the right
places?). On the Debian side, an open SSH session received many
of these messages leading up to the crash:
   kernel:[489300.648296] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kworker/0:0:14902]
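
As a rough sketch of how the frequency of these messages could be
summarised, the shell function below counts soft lockup reports per
CPU and task. The function name summarise_lockups and the log path in
the usage note are assumptions for illustration, not something taken
from the affected system. It expects each message on a single line, as
it appears in the log file itself.

```shell
#!/bin/sh
# Count "soft lockup" reports in a kernel log, grouped by CPU and task.
# Hypothetical helper: the name and interface are illustrative only.
summarise_lockups() {
    # $1: path to a kernel log file with one message per line
    grep -E 'soft lockup - CPU#[0-9]+ stuck for [0-9]+s!' "$1" |
        # keep only "CPU#N task" from each matching line
        sed -E 's/.*soft lockup - (CPU#[0-9]+) stuck for [0-9]+s! \[([^]]+)\].*/\1 \2/' |
        # count identical CPU/task pairs, most frequent first
        sort | uniq -c | sort -rn
}
```

It could be run inside the guest as, for example,
summarise_lockups /var/log/kern.log (assuming that is where the
guest's kernel log is written). Each output line is a count followed
by the CPU and the task that was reported as stuck.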

The Debian kern.log file contains this just before the crash:
Sep  6 10:23:59 hostname kernel: [488456.219948] INFO: rcu_sched self-detected stall on CPU
Sep  6 10:23:59 hostname kernel: [488456.220007]        0-...: (5249 ticks this GP) idle=b45/140000000000001/0 softirq=27802459/27802459 fqs=2423
Sep  6 10:23:59 hostname kernel: [488456.220062]         (t=5250 jiffies g=10449032 c=10449031 q=319)
Sep  6 10:23:59 hostname kernel: [488456.220093] Task dump for CPU 0:
Sep  6 10:23:59 hostname kernel: [488456.220094] kworker/0:0     R running task        0 14902      2 0x00000008
Sep  6 10:23:59 hostname kernel: [488456.220108] Workqueue: rpciod rpc_async_schedule [sunrpc]
Sep  6 10:23:59 hostname kernel: [488456.220109]  ffffffff90713580 ffffffff8faa3bcb 0000000000000000 ffffffff90713580
Sep  6 10:23:59 hostname kernel: [488456.220111]  ffffffff8fb7a4b6 ffff8a0bffc18fc0 ffffffff9064a6c0 0000000000000000
Sep  6 10:23:59 hostname kernel: [488456.220112]  ffffffff90713580 00000000ffffffff ffffffff8fadee04 0000000000e746a9
Sep  6 10:23:59 hostname kernel: [488456.220113] Call Trace:
Sep  6 10:23:59 hostname kernel: [488456.220114]  <IRQ>
Sep  6 10:23:59 hostname kernel: [488456.220116]  [<ffffffff8faa3bcb>] ? sched_show_task+0xcb/0x130
Sep  6 10:23:59 hostname kernel: [488456.220118]  [<ffffffff8fb7a4b6>] ? rcu_dump_cpu_stacks+0x92/0xb2
Sep  6 10:23:59 hostname kernel: [488456.220119]  [<ffffffff8fadee04>] ? rcu_check_callbacks+0x754/0x8a0
Sep  6 10:23:59 hostname kernel: [488456.220121]  [<ffffffff8faed0c3>] ? update_wall_time+0x473/0x790
Sep  6 10:23:59 hostname kernel: [488456.220122]  [<ffffffff8faf48c0>] ? tick_sched_handle.isra.12+0x50/0x50
Sep  6 10:23:59 hostname kernel: [488456.220124]  [<ffffffff8fae5718>] ? update_process_times+0x28/0x50
Sep  6 10:23:59 hostname kernel: [488456.220125]  [<ffffffff8faf4890>] ? tick_sched_handle.isra.12+0x20/0x50
Sep  6 10:23:59 hostname kernel: [488456.220125]  [<ffffffff8faf48f8>] ? tick_sched_timer+0x38/0x70
Sep  6 10:23:59 hostname kernel: [488456.220126]  [<ffffffff8fae60fc>] ? __hrtimer_run_queues+0xdc/0x240
Sep  6 10:23:59 hostname kernel: [488456.220127]  [<ffffffff8fae67cc>] ? hrtimer_interrupt+0x9c/0x1a0
Sep  6 10:23:59 hostname kernel: [488456.220128]  [<ffffffff90008ba9>] ? smp_apic_timer_interrupt+0x39/0x50
Sep  6 10:23:59 hostname kernel: [488456.220129]  [<ffffffff90007ec2>] ? apic_timer_interrupt+0x82/0x90
Sep  6 10:23:59 hostname kernel: [488456.220130]  <EOI>
Sep  6 10:23:59 hostname kernel: [488456.220131]  [<ffffffff8fac0e11>] ? native_queued_spin_lock_slowpath+0x21/0x190
Sep  6 10:23:59 hostname kernel: [488456.220132]  [<ffffffff9000613d>] ? _raw_spin_lock+0x1d/0x20
Sep  6 10:23:59 hostname kernel: [488456.220141]  [<ffffffffc047e87a>] ? nfs4_close_done+0xfa/0x400 [nfsv4]
Sep  6 10:23:59 hostname kernel: [488456.220145]  [<ffffffffc0493280>] ? nfs4_xdr_dec_open_downgrade+0xf0/0xf0 [nfsv4]
Sep  6 10:23:59 hostname kernel: [488456.220151]  [<ffffffffc02fb5f0>] ? __rpc_sleep_on_priority+0x340/0x340 [sunrpc]
Sep  6 10:23:59 hostname kernel: [488456.220155]  [<ffffffffc02fb5f0>] ? __rpc_sleep_on_priority+0x340/0x340 [sunrpc]
Sep  6 10:23:59 hostname kernel: [488456.220159]  [<ffffffffc02fb61a>] ? rpc_exit_task+0x2a/0x90 [sunrpc]
Sep  6 10:23:59 hostname kernel: [488456.220163]  [<ffffffffc02fbf86>] ? __rpc_execute+0x86/0x420 [sunrpc]
Sep  6 10:23:59 hostname kernel: [488456.220164]  [<ffffffff8fa90384>] ? process_one_work+0x184/0x410
Sep  6 10:23:59 hostname kernel: [488456.220165]  [<ffffffff8fa9065d>] ? worker_thread+0x4d/0x480
Sep  6 10:23:59 hostname kernel: [488456.220166]  [<ffffffff8fa90610>] ? process_one_work+0x410/0x410
Sep  6 10:23:59 hostname kernel: [488456.220167]  [<ffffffff8fa7bb0a>] ? do_group_exit+0x3a/0xa0
Sep  6 10:23:59 hostname kernel: [488456.220168]  [<ffffffff8fa965d7>] ? kthread+0xd7/0xf0
Sep  6 10:23:59 hostname kernel: [488456.220169]  [<ffffffff8fa96500>] ? kthread_park+0x60/0x60
Sep  6 10:23:59 hostname kernel: [488456.220170]  [<ffffffff900064f5>] ? ret_from_fork+0x25/0x30
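
To make a trace like this easier to compare against other crashes, the
symbol names can be pulled out of the log. The following is a minimal
sketch, assuming it is run against the original kern.log, where each
frame is on one line. The function name trace_symbols is hypothetical.
It prints one symbol per frame, followed by the module name when the
log gives one.

```shell
#!/bin/sh
# Print the symbol (and module, if any) of every stack frame in a
# kernel log. Frames look like:
#   [<addr>] ? symbol+0xoffset/0xlength [module]
# Hypothetical helper: the name and interface are illustrative only.
trace_symbols() {
    # $1: path to a kernel log file with one message per line
    awk '
        # only lines that contain a "[<hex address>] " token are frames
        match($0, /\[<[0-9a-f]+>\] /) {
            rest = substr($0, RSTART + RLENGTH)
            sub(/^\? /, "", rest)      # drop the "?" marker if present
            split(rest, f, " ")
            sym = f[1]
            sub(/\+0x.*/, "", sym)     # strip the +offset/length suffix
            mod = (f[2] ~ /^\[.*\]$/) ? " " f[2] : ""
            print sym mod
        }
    ' "$1"
}
```

For the trace above, the frames after <EOI> would come out starting
with native_queued_spin_lock_slowpath, _raw_spin_lock and
nfs4_close_done [nfsv4]. Frames marked with "?" in the log are ones
the kernel could not verify while unwinding, so the list is a hint at
where the worker was spinning rather than an exact call chain.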

This seems to be all I have to go on. This is the first panic I
have experienced since upgrading to 11.1. I was seeing similar
panics on 11.0, but the log output from those looked different:
the kernel spat out hundreds of errors in the hours leading up to
the crash. I'm not sure those logs are relevant, since they are from
11.0 and the errors this time are similar but not the same, but I
can attach that log file if anyone is interested.

The VM startup command is:
  bhyve -AHP \
    -s 0:0,hostbridge \
    -s 1:0,lpc \
    -s 2:0,virtio-net,tap0 \
    -s 3:0,virtio-net,tap1 \
    -s 4:0,virtio-blk,/dev/zvol/tank/vms/hostname-root \
    -s 5:0,virtio-blk,/dev/zvol/tank/vms/hostname-scratch \
    -s 6:0,virtio-blk,/dev/zvol/tank/vms/hostname-temp \
    -s 29,fbuf,tcp=127.0.0.1:5900,w=800,h=600 \
    -l com1,/dev/nmdm0A \
    -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd \
    -c 2 \
    -m 32G hostname

Anything that could shed some light on this issue would be much
appreciated. If I can provide any additional information please let me
know.

Thank you,
Kari Hreinsson
