[Bug 219355] Heavy disk activity in bhyve deadlocks host

bugzilla-noreply at freebsd.org
Wed May 17 13:48:02 UTC 2017


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219355

            Bug ID: 219355
           Summary: Heavy disk activity in bhyve deadlocks host
           Product: Base System
           Version: 11.0-STABLE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs at FreeBSD.org
          Reporter: freebsd-bugs at joe.mulloy.me

Hello,

I have a somewhat complicated server setup that I am trying to run bhyve on. I
did not have this problem on FreeBSD 10, although that may have been because my
zpool was not as full then.

My main OS pool is a small unencrypted mirror. My jails and bhyve VMs are
stored on a separate pool named "data", made up of 4x 3TB WD Red drives
arranged as a pair of mirrors (RAID 10). Each disk is encrypted with geli and
the geli device is passed to ZFS. I am using chyves to manage my bhyve VMs, and
the VM disks are stored in zvols.

Heavy disk activity in a guest reliably deadlocks the host system, and I then
need to reboot/reset it. So far I have been able to trigger this by attempting
to install Windows 7, where the host deadlocks when Windows starts copying
files. I have also triggered it a couple of times in a FreeBSD guest while
assembling a jail by hand, by copying a template directory holding a FreeBSD
installation.

The host system has 32GB of RAM and I only give 2GB to the VM, so I should have
plenty of memory. Below is the output of top during one of the deadlocks while
trying to install Windows. The state of the bhyve process is kqread.

I was able to install Windows 7 successfully by storing the guest on a separate
geli-encrypted pool. At one point my "data" pool was 85% full. I cleaned it up
to 50% full but I am still having this issue, so I think the pool has somehow
got into a state where it will keep having this problem. I would like to fix it
by recreating the pool, but I would like to debug it first in case there is a
bug that can be fixed. I don't know how to debug this further, so I would
appreciate any instructions or commands for doing so.

root@server1:~ # chyves win7 get all
Checking for newer version of chyves on the master branch from https://github.com/chyves/chyves/raw/master/sbin/chyves.
Setting global property 'check_for_updates_last_check' to value: '20170517'
On current version, will check again on: 20170524
Setting global property 'check_for_updates_last_check_status' to value: '0'
Getting all win7's properties...
bargs                                -H -P -S
bhyve_disk_type                      ahci-hd
bhyve_net_type                       virtio-net
bhyveload_flags
chyves_guest_version                 0300
cpu                                  1
creation                             Created on Mon Oct 10 22:06:33 UTC 2016 by chyves v0.2.0 2016/09/11 using __create()
description                          -
eject_iso_on_n_reboot                2
loader                               uefi
net_ifaces                           tap55
notes                                -
os                                   default
ram                                  2G
rcboot                               0
revert_to_snapshot_method            off
revert_to_snapshot
serial                               nmdm55
template                             no
uefi_console_output                  vnc
uefi_firmware                        BHYVE_UEFI_20160704_1.fd
uefi_vnc_ip                          10.2.4.50
uefi_vnc_mouse_type                  ps2
uefi_vnc_pause_until_client_connect  yes
uefi_vnc_port                        5900
uefi_vnc_res                         1024x768
uuid                                 cce87028-8f35-11e6-86a3-94de80a12470

chyves version: chyves v0.2.0 2016/09/11

root@server1:~ # uname -a
FreeBSD server1.jdmulloy.net 11.0-RELEASE-p9 FreeBSD 11.0-RELEASE-p9 #0: Tue Apr 11 08:48:40 UTC 2017     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64

483 processes: 2 running, 451 sleeping, 30 waiting
CPU: 16.3% user,  0.0% nice, 83.7% system,  0.0% interrupt,  0.0% idle
Mem: 2357M Active, 864M Inact, 28G Wired, 4520K Free 
ARC: 4978M Total, 3836K MFU, 115M MRU, 2456M Anon, 1602M Header, 800M Other
Swap: 16G Total, 257M Used, 16G Free, 1% Inuse

  PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
10662 root         15  52    0 45776K 17136K uwait   2   2:11  72.10% filebeat
10136 root         12  20    0 43344K 16612K uwait   2   3:38  51.58% filebeat
11208 root         12  20    0 43344K 16840K tx->tx  2   2:25  35.52% filebeat
 5373 root         14  20    0 43600K 14824K pfault  1   1:51  27.82% filebeat
  840 root         11  20    0 39120K 11004K pfault  0   0:39  12.06% filebeat
  792 zabbix        1  20    0 30596K  4204K RUN     3   0:32  12.00% zabbix_agentd
13401 root         21  20    0  2127M  1963M kqread  1   3:18   3.23% bhyve
 8940 root          1  20    0 26264K  5000K CPU3    3   0:01   0.19% top
11613 root         11  47    0 41100K 14368K uwait   3   0:00   0.12% filebeat
 7411 root         14  20    0 45100K 17612K tx->tx  3   0:00   0.09% filebeat
 6504 root         10  52    0 39984K 13992K uwait   3   0:01   0.09% filebeat
 1125 root          1  20    0 22004K  3692K select  3   0:02   0.06% tmux
 1105 jdmulloy2     1  20    0 46760K  5688K select  3   0:01   0.06% mosh-server
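
To make the memory figures in the top header easier to compare, here is a minimal sketch that converts the "Mem:" and "ARC:" lines to a common unit (MiB). It only restates the numbers pasted above; the helper name to_mib and the variable names are hypothetical, and top rounds its figures, so the results are approximate.

```python
import re

# Figures copied verbatim from the top(1) header in this report.
MEM_LINE = "Mem: 2357M Active, 864M Inact, 28G Wired, 4520K Free"
ARC_LINE = ("ARC: 4978M Total, 3836K MFU, 115M MRU, 2456M Anon, "
            "1602M Header, 800M Other")

# Multipliers to express each top(1) unit suffix in MiB.
UNIT_TO_MIB = {"K": 1 / 1024, "M": 1, "G": 1024}


def to_mib(line):
    """Return {label: MiB} for each '<number><unit> <label>' pair in a line."""
    return {
        label: int(value) * UNIT_TO_MIB[unit]
        for value, unit, label in re.findall(r"(\d+)([KMG]) (\w+)", line)
    }


mem = to_mib(MEM_LINE)
arc = to_mib(ARC_LINE)

# Print every memory class on the same scale.
for label, mib in mem.items():
    print(f"{label:>6}: {mib:10.1f} MiB")

# Show how the reported ARC total compares with wired memory.
print(f"ARC total:            {arc['Total']:10.1f} MiB")
print(f"Wired minus ARC total:{mem['Wired'] - arc['Total']:10.1f} MiB")
```

On a common scale, the wired figure is several times larger than the reported ARC total, while free memory is only a few MiB. The sketch does not say what is holding the rest of the wired memory.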


