Date: Wed, 8 Feb 2023 14:24:09 -0500
From: Mark Johnston
To: John Baldwin
Cc: Corvin Köhne, freebsd-virtualization@freebsd.org
Subject: Re: bhyve 13.1 compatibility breakage
List-Archive: https://lists.freebsd.org/archives/freebsd-virtualization

On Wed, Feb 08, 2023 at 11:08:31AM -0800, John Baldwin wrote:
> On 2/8/23 10:05 AM, Mark Johnston wrote:
> > On Sun, Jan 15, 2023 at 10:46:21AM -0500, Mark Johnston wrote:
> >> On Wed, Nov 23, 2022 at 08:00:16AM +0000, Corvin Köhne wrote:
> >>> The branch main has been updated by corvink:
> >>>
> >>> URL: https://cgit.FreeBSD.org/src/commit/?id=7c326ab5bb9aced8dcbc2465ac1c9ff8df2ba46b
> >>>
> >>> commit 7c326ab5bb9aced8dcbc2465ac1c9ff8df2ba46b
> >>> Author:     Corvin Köhne
> >>> AuthorDate: 2022-11-21 14:00:04 +0000
> >>> Commit:     Corvin Köhne
> >>> CommitDate: 2022-11-23 08:00:04 +0000
> >>>
> >>>     vmm: don't lock a mtx in the icr_low write handler
> >>>
> >>>     x2apic accesses are handled by a wrmsr exit. This handler is called in a
> >>>     critical section. So, we can't lock a mtx in the icr_low handler.
> >>>
> >>>     Reported by:    kp, pho
> >>>     Tested by:      kp, pho
> >>>     Approved by:    manu (mentor)
> >>>     Fixes:          c0f35dbf19c3c8825bd2b321d8efd582807d1940 vmm: Use a cpuset_t for vCPUs waiting for STARTUP IPIs.
> >>>     MFC after:      1 week
> >>>     MFC with:       c0f35dbf19c3c8825bd2b321d8efd582807d1940
> >>>     Sponsored by:   Beckhoff Automation GmbH & Co. KG
> >>>     Differential Revision: https://reviews.freebsd.org/D37452
> >>> ---
> >>
> >> Hi Corvin,
> >>
> >> This seems to break AP startup when using bhyve/libvmmapi from FreeBSD
> >> 13.1 on a kernel built from main. It looks like the commit somehow
> >> regresses commit 769b884e2e2, but I'm not sure yet exactly how.
> >
> > I debugged this further and am not quite sure how to fix the problem,
> > which isn't specific to this commit after all. I'll try to describe it
> > here.
> >
> > Suppose we're booting a VM with 2 vCPUs. When the BSP raises the INIT
> > IPI to start AP 1, vlapic_icrlo_write_handler() looks up the destination
> > vCPU with vlapic_calcdest(), which only returns active vCPUs.
> > However, old bhyve executables activate APs (i.e., call
> > vm_activate_cpu()) lazily, only upon receiving a
> > VM_EXITCODE_SPINUP_AP message. Thus, vm_handle_ipi() simply doesn't
> > do anything since "dmask" is empty, so APs don't boot up.
> >
> > To further complicate things, new vmm.ko allocates vCPUs lazily. New
> > bhyve executables call vm_activate_cpu() for all vCPUs before running
> > the BSP, but as I said above, old bhyve executables do not. So merely
> > fixing "dmask" in vlapic_icrlo_write_handler() to include
> > not-yet-activated vCPUs doesn't work, and we can't allocate a new vCPU
> > in that context. In general it seems that we want an INIT IPI to
> > trigger allocation of a vcpu structure to preserve compatibility with
> > old bhyve, but I don't see a good way to implement that.
> >
> > I would quite like to fix this since I make heavy use of 13.1-RELEASE
> > bhyve+jails on a kernel running main. I believe bhyve from stable/13 is
> > unaffected, but 13.2 is not yet released. Any suggestions would be
> > appreciated.
>
> Hmm, I thought I had fixed this by using the bitmask of started CPUs
> rather than requiring the vCPU to be allocated. I was definitely testing
> an old bhyve binary from head against the vCPU branch while working on it
> and remember hitting this exact case, but I thought I had fixed it. Oh,
> hmm, my fix was the commit quoted above (769b884e2e2).

After looking further, I think this is actually not so painful to fix.
I have a patch which seems to work, but I need to test further.

Lazy allocation of the vcpu structure is mostly fine. The only problem
is that the INIT handler in vm_handle_ipi() wants to reinitialize the
vlapic on all destination vCPUs, and this doesn't work if any of them
are not yet allocated. But if they're not yet allocated, then we can
rely on a later vm_alloc_vcpu() to initialize vlapic state, so it
doesn't really matter.
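
For illustration only, here is a rough user-space sketch of the idea that an INIT aimed at a not-yet-allocated vCPU can simply be skipped. It is not the patch itself: the types, the SKETCH_MAXCPU limit and the sketch_handle_init() name are all hypothetical stand-ins, and a plain unsigned bitmask stands in for a cpuset_t.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAXCPU 4	/* hypothetical limit, not the vmm(4) one */

/* Hypothetical stand-in for per-vCPU state. */
struct sketch_vcpu {
	bool lapic_reset;	/* set when INIT reinitialized the vlapic */
};

/* Hypothetical stand-in for a VM; slots stay NULL until allocated. */
struct sketch_vm {
	struct sketch_vcpu *vcpu[SKETCH_MAXCPU];
};

/*
 * Deliver INIT to every vCPU named in dmask.  Unallocated vCPUs are
 * skipped, on the assumption that their vlapic state gets initialized
 * when they are allocated later.  Returns the number of vCPUs reset.
 */
int
sketch_handle_init(struct sketch_vm *vm, unsigned int dmask)
{
	int nreset = 0;

	for (int i = 0; i < SKETCH_MAXCPU; i++) {
		if ((dmask & (1u << i)) == 0)
			continue;	/* not a destination */
		if (vm->vcpu[i] == NULL)
			continue;	/* not allocated yet; nothing to reset */
		vm->vcpu[i]->lapic_reset = true;
		nreset++;
	}
	return (nreset);
}
```

With only vCPU 0 allocated, an INIT sent to a mask covering vCPUs 0 and 1 resets vCPU 0 and silently ignores vCPU 1.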
So if I change vlapic_calcdest() to report inactive destination CPUs for
INIT and STARTUP IPIs, change vm_handle_ipi() to only invoke
vlapic_handle_init() on active vCPUs, and fix a small bug in this commit
(CPU_FFS() is 1-indexed, not 0-indexed), then I can boot multicore VMs
using a 13.1-RELEASE bhyve again.
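
To make the indexing point concrete, here is a small sketch using the POSIX ffs() function as a stand-in for CPU_FFS(), assuming both follow the same convention: the result is a 1-based bit position, with 0 meaning "no bit set". The lowest_vcpu_id() helper and the int bitmask are hypothetical and only there for illustration.

```c
#include <strings.h>	/* ffs() */

/*
 * Return the 0-based index of the lowest set bit in mask, or -1 if the
 * mask is empty.  ffs() reports a 1-based position (0 means no bit is
 * set), so the result has to be decremented before it is used as a
 * vCPU ID.
 */
int
lowest_vcpu_id(int mask)
{
	int pos = ffs(mask);

	return (pos == 0 ? -1 : pos - 1);
}
```

Using the raw return value as an ID would point one vCPU too high, for example at vCPU 2 for a mask that only contains vCPU 1.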