Re: ZFS + FreeBSD XEN dom0 panic
Date: Wed, 02 Mar 2022 08:57:37 UTC
Hello,

I started using XEN on a pre-production machine (with the aim of using it in production later) running 12.2, but since it experienced random crashes I updated to 13.0 in the hope that the errors would disappear.

I do not know how much detail I should include so that this email is not too long but still gives enough information.

The FreeBSD Dom0 is installed on ZFS, a fairly basic install, with IPFW and NAT rules in use. The zpool is composed of 2 mirrored disks. There is a ZVOL with volmode=dev for each VM and for each VM's jails, and these are attached to the DomU as raw devices. At the moment the DomUs run FreeBSD, versions 12.0 to 13.0, on UFS, with VNET jails whose epairs are all bridged to the DomU's xn0 interface. On Dom0 I have bridge interfaces to which the DomUs are connected depending on their "zone/network"; those that are allowed outgoing connections are NATted by IPFW on a specific physical NIC and IP.

xen_cmdline="dom0_mem=6144M cpufreq=dom0-kernel dom0_max_vcpus=4 dom0=pvh console=vga,com1 com1=115200,8n1 guest_loglvl=all loglvl=all"

The physical hardware is a XEON CPU, 16G of ECC RAM, and 2x8TB HDDs.

The DomU config is something like this:

    memory = 1024
    vcpus=2
    name = "sys-01"
    type = "hvm"
    boot = "dc"
    vif = [ 'vifname=xbr0p5,type=vif,mac=00:16:3E:01:63:05,bridge=xbr0' ]
    disk = [ 'backendtype=phy, format=raw, vdev=xvda, target=/dev/zvol/sys/vmdk/root/sys-01-root',
             'backendtype=phy, format=raw, vdev=xvdb, target=/dev/zvol/sys/vmdk/root/sys-01-jail1',
             'backendtype=phy, format=raw, vdev=xvdc, target=/dev/zvol/sys/vmdk/root/sys-01-jail2'
             .. more defs, if any ..
           ]
    vnc=1
    vnclisten="0.0.0.0:X"
    usbdevice = "tablet"
    serial = "pty"

Right after startup the overall system works, speeds are acceptable, and the load is not high, so the system is not under stress.

The thing is that at unexpected times I noticed the system rebooting, e.g. when I create a new ZFS volume in Dom0, when I reboot a DomU, or when I do something in Dom0 that seems unrelated. Sometimes init 0 would reboot the system, sometimes it would shut it down. It somehow felt that the panics happen when there is HDD load.

So I got a somewhat similar machine for a testing/lab environment (16G ECC, a slower XEON, 2x2TB HDD, and a serial port) and started trying to push that system to its limits with various combinations: restricting RAM, CPUs, etc. The bug info contains the combination that seemed to me to be the fastest way to panic the system.

For XEN startup, "vifname=" did not work as described in the XEN user manual pages with the default startup script, so I added "ifconfig name $vifname" to that script. This was necessary because ipfw rules that use "via $ifname in" need a specific NIC, but by default XEN created a new NIC name each time, depending on which name was free. This is not active on the lab system, and it still crashes, so I do not think this is the cause of the problem.

About the history. I believe the hardware is okay, since before XEN I was using FreeBSD 12.2 (upgraded incrementally from 12.0) with ZFS and jails a lot; the VNETs used netgraph (VNET bridge and ethernet interfaces). What I loved about that setup was the clean output of ifconfig, since the host had only the bridge interface, and the virtual ethernet interfaces for the jails came directly from that bridge. Creating a new jail was just a "zfs clone"; it did not take much space, snapshots could be made for backups, and the HDD space for each jail could easily be expanded or limited thanks to ZFS capabilities. The system was stable.

The problem with that setup was that if some jail started to misbehave badly, it was hard to control overall system performance and behavior. I tried rctl, but jails could misbehave in new, unexpected ways (exhausting RAM, process count, CPU load, HDD load, opening too many network sockets, etc.). If the OOM killer started to kill processes, it was impossible to control which process/jail should get killed first and which should be kept. So it seemed to me that virtualization is a better way to solve that, i.e. to have a system VM that has DNS, web gateway, etc., and lower priority VMs that could crash if misbehaving.

I like the XEN architecture in general, and I would like to use FreeBSD as Dom0 if possible, because of ZFS, my knowledge of the OS, and its good history of stability. Since a ZFS dataset cannot be passed through to a DomU, my idea was to use ZVOLs and UFS within the VM; then I could snapshot those ZVOLs for backups, and the DomU could growfs when necessary. Somewhat less convenient than the jail architecture, but still good enough.

My first attempt was to keep the netgraph jails in Dom0, but it turned out badly. A system panic happened almost every time a jail was started or stopped. Not with the first jail, but the 5th and later ones panicked the system with high probability. So I started to use epairs instead. It was less unstable, but still crashed from time to time. Now there are no jails, and it still crashes.

I tried different ideas. I tried passing through a whole HDD as raw in DomU-iscsi and using ctld on Dom0 to provide disks for other DomUs; HDD speed was bad, and the system still crashed. I tried raw files on ZFS datasets; speeds actually seemed close to ZVOLs, but the system still crashed.

So now I am starting to wonder: what configurations do people use successfully? What have I missed?

On Tue, Mar 1, 2022 at 5:40 PM Brian Buhrow <buhrow@nfbcal.org> wrote:

> hello. I've been running FreeBSD-12.1 and Freebsd-12.2 plus ZFS plus Xen
> with FreeBSD as dom0 without any stability issues for about 2 years now.
> I'm doing this on a number of systems, with a variety of NetBSD, FreeBSD
> and Linux as domU guests. I haven't looked at your bug details, but are
> you running FreeBSD-13?
> -thanks
> -Brian