From nobody Sun Jul 17 10:16:24 2022 X-Original-To: dev-commits-src-main@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4Lm1GR5Hr4z4WrLc; Sun, 17 Jul 2022 10:16:31 +0000 (UTC) (envelope-from markjdb@gmail.com) Received: from mail-qk1-x731.google.com (mail-qk1-x731.google.com [IPv6:2607:f8b0:4864:20::731]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1D4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4Lm1GP31yCz3KmW; Sun, 17 Jul 2022 10:16:29 +0000 (UTC) (envelope-from markjdb@gmail.com) Received: by mail-qk1-x731.google.com with SMTP id m16so2549352qka.12; Sun, 17 Jul 2022 03:16:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to; bh=DMXx9wYP1u1DNPCNDY0hnn6FAPp1ZkidPjTVivRhdu8=; b=W9+gwssFxisFbdNbv5zDP0wpFlvnwNiyNf1uRto9PEozGbAen6ycdlgt41sqb4qtC7 Vpmd4Q2uFrcGoaUwaSLRAwQJBvOnJm8MYJ7d2yY67l8Id394DCJawZ39z4AAi7koxrqF tPPg0RzMUHx5B+pmKJecgx+Gl3BDNVcveOTV75NymisitJMJ1K7cM8FinPaBm5oVfYF5 BXjyCx+mDQ3WemZ1/AuKlHnQjhCagxDhFK0vVXMabHUDTyJJBMddVHWM1SxXfE04UjZI Vq807fV9J4gQ/DUzl3w2su953tnsxXCqhWzMRk+3t6IcE9HsuX6c4Wm+kuawN0V9wdE5 QZSQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:sender:date:from:to:cc:subject:message-id :references:mime-version:content-disposition :content-transfer-encoding:in-reply-to; bh=DMXx9wYP1u1DNPCNDY0hnn6FAPp1ZkidPjTVivRhdu8=; b=n3fm1FmtUW4/n179qIhTGujZbax7UWpZMzpdMIEtdBQYJT8jBBplUEVSIBUzLDSqTm I0Rs4O2aymo8d3yy15ZvvLpTq7AflhGbKz/6znNXVL6vaFlukV4yHhUamBIzsNmF849h A26nDiUOuRcG4iXtEFmapmDikRJi9aQsdmDnV0GuIbTn5mw5IYCLVbAm/pSS6kBbfS5I 2mkfNv77I8pHmVH8LfJk9ZN4sBH/NsfoCqW15C+qWBtrMeE0zVfrMPcHr0czzyuOnqx4 WDOUlDDgMRFP0OyN7QwPOLaR9lJ97C8hAM0BQJT1ntUXLuGrzkG8DLAmc8aXpUOJkWLM ncbg== X-Gm-Message-State: AJIora8Aoy4YhlKVe3b3nNeiJGeoJxNGDL7Yv1lFa/eg9Hnh4QgELRUq tauddSX4OnxbVds2xWTDewsdnUU7TLc= X-Google-Smtp-Source: AGRyM1umv3/SeXeINn51S0dzu11qiZct88Ai8Ps5VlCB9HeLQ7WUjG54OtQSz2StiiGEloledDaTvQ== X-Received: by 2002:a05:620a:22fa:b0:6af:6c8c:32e8 with SMTP id p26-20020a05620a22fa00b006af6c8c32e8mr14776734qki.15.1658052988282; Sun, 17 Jul 2022 03:16:28 -0700 (PDT) Received: from nuc (192-0-220-237.cpe.teksavvy.com. [192.0.220.237]) by smtp.gmail.com with ESMTPSA id do16-20020a05620a2b1000b006b5ab88e544sm8846372qkb.124.2022.07.17.03.16.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 17 Jul 2022 03:16:27 -0700 (PDT) Date: Sun, 17 Jul 2022 06:16:24 -0400 From: Mark Johnston To: =?utf-8?Q?T=C4=B3l?= Coosemans Cc: src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org Subject: Re: git: 03f868b163ad - main - x86: Add a required store-load barrier in cpu_idle() Message-ID: References: <202207141447.26EElbAZ041128@gitrepo.freebsd.org> <20220717113143.358f28b2@FreeBSD.org> List-Id: Commit messages for the main branch of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-dev-commits-src-main@freebsd.org X-BeenThere: dev-commits-src-main@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20220717113143.358f28b2@FreeBSD.org> X-Rspamd-Queue-Id: 4Lm1GP31yCz3KmW X-Spamd-Bar: -- Authentication-Results: mx1.freebsd.org; dkim=pass header.d=gmail.com header.s=20210112 header.b=W9+gwssF; dmarc=none; spf=pass (mx1.freebsd.org: domain of markjdb@gmail.com designates 2607:f8b0:4864:20::731 as permitted sender) smtp.mailfrom=markjdb@gmail.com X-Spamd-Result: default: False [-2.67 / 15.00]; NEURAL_HAM_LONG(-1.00)[-0.998]; NEURAL_HAM_SHORT(-0.99)[-0.992]; NEURAL_HAM_MEDIUM(-0.98)[-0.984]; MID_RHS_NOT_FQDN(0.50)[]; FORGED_SENDER(0.30)[markj@freebsd.org,markjdb@gmail.com]; R_SPF_ALLOW(-0.20)[+ip6:2607:f8b0:4000::/36:c]; R_DKIM_ALLOW(-0.20)[gmail.com:s=20210112]; MIME_GOOD(-0.10)[text/plain]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[4]; RCVD_IN_DNSWL_NONE(0.00)[2607:f8b0:4864:20::731:from]; BLOCKLISTDE_FAIL(0.00)[192.0.220.237:server fail]; TO_MATCH_ENVRCPT_ALL(0.00)[]; DMARC_NA(0.00)[freebsd.org]; DWL_DNSWL_NONE(0.00)[gmail.com:dkim]; MLMMJ_DEST(0.00)[dev-commits-src-all@freebsd.org,dev-commits-src-main@freebsd.org]; DKIM_TRACE(0.00)[gmail.com:+]; RCVD_COUNT_THREE(0.00)[3]; TO_DN_SOME(0.00)[]; ARC_NA(0.00)[]; RCVD_TLS_LAST(0.00)[]; FREEMAIL_ENVFROM(0.00)[gmail.com]; MIME_TRACE(0.00)[0:+]; FROM_NEQ_ENVFROM(0.00)[markj@freebsd.org,markjdb@gmail.com]; ASN(0.00)[asn:15169, ipnet:2607:f8b0::/32, country:US]; RCVD_VIA_SMTP_AUTH(0.00)[] X-ThisMailContainsUnwantedMimeParts: N On Sun, Jul 17, 2022 at 11:31:43AM +0200, Tijl Coosemans wrote: > On Thu, 14 Jul 2022 14:47:37 GMT Mark Johnston wrote: > > The branch main has been updated by markj: > > > > URL: https://cgit.FreeBSD.org/src/commit/?id=03f868b163ad46d6f7cb03dc46fb83ca01fb8f69 > > > > commit 03f868b163ad46d6f7cb03dc46fb83ca01fb8f69 > > Author: Mark Johnston > > AuthorDate: 2022-07-14 14:24:25 +0000 > > Commit: Mark Johnston > > CommitDate: 2022-07-14 14:28:01 +0000 > > > > x86: Add a required store-load barrier in cpu_idle() > > > > ULE's tdq_notify() tries to avoid delivering IPIs to the idle thread. > > In particular, it tries to detect whether the idle thread is running. > > There are two mechanisms for this: > > - tdq_cpu_idle, an MI flag which is set prior to calling cpu_idle(). If > > tdq_cpu_idle == 0, then no IPI is needed; > > - idle_state, an x86-specific state flag which is updated after > > cpu_idleclock() is called. > > > > The implementation of the second mechanism is racy; the race can cause a > > CPU to go to sleep with pending work. Specifically, cpu_idle_*() set > > idle_state = STATE_SLEEPING, then check for pending work by loading the > > tdq_load field of the CPU's runqueue. These operations can be reordered > > so that the idle thread observes tdq_load == 0, and tdq_notify() > > observes idle_state == STATE_RUNNING. > > > > Some counters indicate that the idle_state check in tdq_notify() > > frequently elides an IPI. So, fix the problem by inserting a fence > > after the store to idle_state, immediately before idling the CPU. > > > > PR: 264867 > > Reviewed by: mav, kib, jhb > > MFC after: 1 month > > Sponsored by: The FreeBSD Foundation > > Differential Revision: https://reviews.freebsd.org/D35777 > > --- > > sys/x86/x86/cpu_machdep.c | 103 ++++++++++++++++++++++++++++------------------ > > 1 file changed, 62 insertions(+), 41 deletions(-) > > > > diff --git a/sys/x86/x86/cpu_machdep.c b/sys/x86/x86/cpu_machdep.c > > index fa11f64e2779..040438043c73 100644 > > --- a/sys/x86/x86/cpu_machdep.c > > +++ b/sys/x86/x86/cpu_machdep.c > > @@ -52,6 +52,7 @@ __FBSDID("$FreeBSD$"); > > #include "opt_maxmem.h" > > #include "opt_mp_watchdog.h" > > #include "opt_platform.h" > > +#include "opt_sched.h" > > #ifdef __i386__ > > #include "opt_apic.h" > > #endif > > @@ -532,32 +533,25 @@ static int idle_mwait = 1; /* Use MONITOR/MWAIT for short idle. */ > > SYSCTL_INT(_machdep, OID_AUTO, idle_mwait, CTLFLAG_RWTUN, &idle_mwait, > > 0, "Use MONITOR/MWAIT for short idle"); > > > > -static void > > -cpu_idle_acpi(sbintime_t sbt) > > +static bool > > +cpu_idle_enter(int *statep, int newstate) > > { > > - int *state; > > + KASSERT(atomic_load_int(statep) == STATE_RUNNING, > > + ("%s: state %d", __func__, atomic_load_int(statep))); > > > > - state = &PCPU_PTR(monitorbuf)->idle_state; > > - atomic_store_int(state, STATE_SLEEPING); > > - > > - /* See comments in cpu_idle_hlt(). */ > > - disable_intr(); > > - if (sched_runnable()) > > - enable_intr(); > > - else if (cpu_idle_hook) > > - cpu_idle_hook(sbt); > > - else > > - acpi_cpu_c1(); > > - atomic_store_int(state, STATE_RUNNING); > > -} > > - > > -static void > > -cpu_idle_hlt(sbintime_t sbt) > > -{ > > - int *state; > > - > > - state = &PCPU_PTR(monitorbuf)->idle_state; > > - atomic_store_int(state, STATE_SLEEPING); > > + /* > > + * A fence is needed to prevent reordering of the load in > > + * sched_runnable() with this store to the idle state word. Without it, > > + * cpu_idle_wakeup() can observe the state as STATE_RUNNING after having > > + * added load to the queue, and elide an IPI. Then, sched_runnable() > > + * can observe tdq_load == 0, so the CPU ends up idling with pending > > + * work. tdq_notify() similarly ensures that a prior update to tdq_load > > + * is visible before calling cpu_idle_wakeup(). > > + */ > > + atomic_store_int(statep, newstate); > > +#if defined(SCHED_ULE) && defined(SMP) > > + atomic_thread_fence_seq_cst(); > > +#endif > > Can't you combine the store and the fence with > atomic_store_explicit(memory_order_seq_cst) or, since this is x86 > specific code, atomic_swap_int? > > #if defined(SCHED_ULE) && defined(SMP) > atomic_swap_int(statep, newstate); > #else > atomic_store_int(statep, newstate); > #endif Yes, I believe so. I wrote it that way initially, but decided it was nicer to use the same fence as the MI code with which this synchronizes. What's the advantage of using xchg instead, given that the CPU is about to idle? > > /* > > * Since we may be in a critical section from cpu_idle(), if > > @@ -576,35 +570,62 @@ cpu_idle_hlt(sbintime_t sbt) > > * interrupt. > > */ > > disable_intr(); > > - if (sched_runnable()) > > + if (sched_runnable()) { > > enable_intr(); > > - else > > - acpi_cpu_c1(); > > - atomic_store_int(state, STATE_RUNNING); > > + atomic_store_int(statep, STATE_RUNNING); > > + return (false); > > + } else { > > + return (true); > > + } > > } > > > > static void > > -cpu_idle_mwait(sbintime_t sbt) > > +cpu_idle_exit(int *statep) > > +{ > > + atomic_store_int(statep, STATE_RUNNING); > > +} > > > +static void > > +cpu_idle_hlt(sbintime_t sbt) > > +{ > > + int *state; > > + > > + state = &PCPU_PTR(monitorbuf)->idle_state; > > + if (cpu_idle_enter(state, STATE_SLEEPING)) { > > + acpi_cpu_c1(); > > atomic_store_int(state, STATE_RUNNING); > > Should this call be replaced with cpu_idle_exit(state), just for > consistency? Yes, that would be a bit better. > > - enable_intr(); > > - return; > > } > > +}