Some kind of race condition in adding and removing domus causes VM zombies

From: Brian Buhrow <buhrow_at_nfbcal.org>
Date: Fri, 24 Jun 2022 01:30:56 UTC
	Hello.  I don't have many more details on this issue, but under xen-4.15 and xen-4.16 with
FreeBSD-12 and FreeBSD-13, it's pretty easy to end up with zombie domus that are unkillable
and unrestartable.  Even worse, the block devices associated with these not-quite-gone domus
are unusable by other domus without an entire system reboot.

	How to reproduce:

1.  Shut down a VM that's currently running.  I'm using NetBSD, but FreeBSD domus will
demonstrate this behavior as well.

2.  If auto-restart is set in the domu's conf file, the domu will restart with a new domain id.

3.  Just as the newly restarted domu is coming up, issue:
xl destroy <domid-of-newly-started-domain>
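
The three steps above can be scripted roughly as follows.  This is only a sketch under
assumptions that are not in the report: the function names, the XL and POLL_INTERVAL
variables, and the polling approach are all hypothetical, and the shutdown is issued from
dom0 with "xl shutdown" rather than from inside the guest.  It destroys the new domain as
soon as "xl domid" reports a changed ID, which may or may not hit the same window.  Since
the bug can wedge the host until reboot, it should only be tried on a test machine.

```shell
#!/bin/sh
# Sketch only: wait for an auto-restarted domu to get a new domain ID,
# then destroy it immediately.  XL can be overridden for dry runs.
XL="${XL:-xl}"

# Print the new domain ID once it differs from the old one.
# Gives up (non-zero exit) after the given number of polls.
wait_for_new_domid() {
    name=$1
    old_id=$2
    tries=${3:-50}
    while [ "$tries" -gt 0 ]; do
        new_id=$($XL domid "$name" 2>/dev/null) || new_id=""
        if [ -n "$new_id" ] && [ "$new_id" != "$old_id" ]; then
            echo "$new_id"
            return 0
        fi
        tries=$((tries - 1))
        sleep "${POLL_INTERVAL:-1}"
    done
    return 1
}

# Steps 1-3: shut down, wait for the restart, destroy the new domain.
reproduce() {
    name=$1
    old_id=$($XL domid "$name") || return 1
    $XL shutdown "$name" || return 1
    new_id=$(wait_for_new_domid "$name" "$old_id") || return 1
    $XL destroy "$new_id"
}
```

After sourcing the file, it would be invoked as "reproduce <domu-name>".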

You may see output like the following:

root# xl destroy 20
libxl: error: libxl_device.c:1111:device_backend_callback: Domain 20:unable to remove device with path /local/domain/0/backend/vbd/20/768
libxl: error: libxl_device.c:1111:device_backend_callback: Domain 20:unable to remove device with path /local/domain/0/backend/vif/20/0
libxl: error: libxl_domain.c:1530:devices_destroy_cb: Domain 20:libxl__devices_destroy failed

Now, issue:
root# xl list
(null)                                      20     0     1     --p--d    2083.7
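
In the State column of "xl list", the trailing "d" flag marks a domain as dying, so
"--p--d" above is a paused domain that never finished being torn down.  A minimal sketch
for picking such domains out of the listing, assuming the usual six-column layout (Name,
ID, Mem, VCPUs, State, Time); the function name is made up for illustration:

```shell
#!/bin/sh
# Print the IDs of domains whose state flags include "d" (dying).
# Reads `xl list` output on stdin.  Fields are counted from the end of
# the line so an odd name such as "(null)" does not shift the columns.
list_dying_domids() {
    awk 'NF >= 6 && $(NF-4) ~ /^[0-9]+$/ && $(NF-1) ~ /d/ { print $(NF-4) }'
}
```

It would typically be fed from the live listing, e.g. "xl list | list_dying_domids".  The
numeric check on the ID column also skips the header line.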

The workaround I've found for this issue is to shut down the domu with the -h flag, causing the
system to wait for a final keypress on the console before rebooting.  Then, while it's waiting,
issue the xl destroy command on the old, waiting domain ID.

This workaround will prevent the issue, but it's my view that I shouldn't be able to wedge the
destruction process in this way such that the entire machine needs to be restarted.  Being able
to do this makes the system rather fragile.

-thanks
-Brian