Date: Mon, 29 Nov 2021 16:19:59 -0500
From: Mark Johnston
To: "David E. Cross"
Cc: freebsd-hackers@freebsd.org
Subject: Re: bhyve -D not cleaning up after itself
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers

On Sat, Nov 27, 2021 at 02:40:57AM -0500, David E. Cross wrote:
> I have noticed for awhile that bhyve -D doesn't seem to actually do what
> is claimed (to destroy a VM on guest initiated power-off).  This
> evening I decided to ktrace it to see if I was just not getting
> something about how this was supposed to work, and found:
>
>
>  68613 vcpu 0   CALL  __sysctlbyname(0x1ebcdb20a133,0xe,0,0,0x1ebce4ba60f0,0x9)
>  68613 vcpu 0   SCTL "hw.vmm.destroy"
>  68613 vcpu 0   RET   __sysctlbyname -1 errno 1 Operation not permitted
>  68613 vcpu 0   CALL  exit(0x1)
>
>
> Reading quickly the kernel code for vm_destroy(), I find 2 candidates:
>
> static int
> vmm_priv_check(struct ucred *ucred)
> {
>
>         if (jailed(ucred) &&
>             !(ucred->cr_prison->pr_allow & pr_allow_flag))
>                 return (EPERM);
>
>         return (0);
> }
>
> This doesn't seem to be it, my process is not jailed.
>
> That leads to the only other (I think) call in sysctl_vmm_destroy that
> could return EPERM:
>
> error = sysctl_handle_string(oidp, buf, buflen, req);
>
>
> But I am just not seeing it.  Also this EXACT same call works from the
> context of bhyvectl --vm=FOO --destroy, run from the same shell as the
> bhyve process that just terminated.  Is the 'ctx' somehow incorrect in
> bhyve?  I is used earlier in that function, so I am assuming it is right?

The problem is that bhyve runs in capability mode (see capsicum(4)),
which restricts access to the sysctl namespace.  In particular, most
sysctls are not accessible in capability mode, including
hw.vmm.destroy, so -D is effectively broken.

One possible solution is to spawn an unsandboxed helper process which
can write to the sysctl on bhyve's behalf.  That is a rather
heavyweight solution, though.

Earlier this year some work was done on using a file descriptor-based
interface to create and destroy VMs, moving away from the old
sysctl-based interface.  It's stalled at the moment but I hope to
return to that work quite soon.  That should also help fix the problem
but will take some time to complete.

I think it may be easiest to simply allow writes to the sysctl for the
time being: https://reviews.freebsd.org/D33169
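
For reference, the helper-process idea mentioned above could look
roughly like the sketch below.  It is not taken from bhyve: the names
helper_spawn(), helper_request() and destroy_vm() are made up for
illustration, the code is plain POSIX, and it has not been checked
against the set of system calls permitted in capability mode.  The
assumption is that the helper is forked before the process enters the
sandbox, so only the helper touches hw.vmm.destroy.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#ifdef __FreeBSD__
#include <sys/sysctl.h>
#endif

/* Hypothetical callback: performs the privileged step, returns 0 or an errno value. */
typedef int (*helper_action_t)(const char *vmname);

struct helper {
	pid_t	pid;	/* unsandboxed helper process */
	int	fd;	/* parent's end of the socketpair */
};

/* Example action: write the VM name to hw.vmm.destroy (FreeBSD only). */
static int
destroy_vm(const char *vmname)
{
#ifdef __FreeBSD__
	if (sysctlbyname("hw.vmm.destroy", NULL, NULL, vmname,
	    strlen(vmname)) != 0)
		return (errno);
	return (0);
#else
	(void)vmname;
	return (ENOSYS);	/* no vmm(4) on this platform */
#endif
}

/*
 * Fork the helper.  Intended to be called before entering capability
 * mode.  Returns 0 on success, -1 on failure.
 */
static int
helper_spawn(struct helper *h, helper_action_t action, const char *vmname)
{
	int error, sv[2];
	char req;

	assert(h != NULL && action != NULL && vmname != NULL);
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
		return (-1);
	h->pid = fork();
	if (h->pid == -1) {
		close(sv[0]);
		close(sv[1]);
		return (-1);
	}
	if (h->pid == 0) {
		/* Child: block until one request byte arrives, or EOF. */
		close(sv[0]);
		if (read(sv[1], &req, 1) != 1)
			_exit(0);	/* parent went away without asking */
		error = action(vmname);
		if (write(sv[1], &error, sizeof(error)) != sizeof(error))
			_exit(1);
		_exit(0);
	}
	close(sv[1]);
	h->fd = sv[0];
	return (0);
}

/*
 * Ask the helper to run its action and reap it.  Returns the action's
 * result (0 or an errno value), or -1 if the helper was unreachable.
 */
static int
helper_request(struct helper *h)
{
	int error, status;
	char req = 'D';

	if (send(h->fd, &req, 1, MSG_NOSIGNAL) != 1 ||
	    read(h->fd, &error, sizeof(error)) != (ssize_t)sizeof(error))
		error = -1;
	close(h->fd);
	(void)waitpid(h->pid, &status, 0);
	return (error);
}
```

The caller would do helper_spawn(&h, destroy_vm, vmname) during
startup and helper_request(&h) at the point where the hw.vmm.destroy
sysctl is currently attempted.  An extra process and socket per VM is
what makes this heavyweight compared to relaxing the sysctl.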