From nobody Fri Feb 04 12:01:45 2022
Date: Fri, 4 Feb 2022 12:01:45 GMT
Message-Id: <202202041201.214C1jei061131@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org,
    dev-commits-src-main@FreeBSD.org
From: Konstantin Belousov <kib@FreeBSD.org>
Subject: git: 9c0b759bf9b5 - main - x86 atomics: use lock prefix unconditionally
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: kib
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: 9c0b759bf9b520537616d026f21a0a98d70acd11
Auto-Submitted: auto-generated
The branch main has been updated by kib:

URL: https://cgit.FreeBSD.org/src/commit/?id=9c0b759bf9b520537616d026f21a0a98d70acd11

commit 9c0b759bf9b520537616d026f21a0a98d70acd11
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2022-02-03 09:51:36 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2022-02-04 12:01:39 +0000

    x86 atomics: use lock prefix unconditionally

    Atomics have significant uses besides providing in-system primitives
    for safe memory updates.  They are also used to implement
    communication with out-of-system software or hardware that follows
    some protocol.  For instance, even a UP kernel might require a
    protocol using atomics to communicate with a software-emulated device
    on an SMP hypervisor, and real hardware might need atomic accesses as
    part of its management protocol.

    Another point is that UP configurations on x86 are practically
    extinct, so the slight performance hit from using proper atomics
    unconditionally is not important.  It is compensated by less code
    clutter, which in fact improves the life expectancy of the UP/i386
    support.

    Requested by:	Elliott Mitchell
    Reviewed by:	Elliott Mitchell, imp, jhb, markj, royger
    Sponsored by:	The FreeBSD Foundation
    MFC after:	1 week
    Differential revision:	https://reviews.freebsd.org/D34153
---
 sys/amd64/include/atomic.h | 67 +++++++++++-----------------------------------
 sys/i386/include/atomic.h  | 52 +++++++++++-------------------------
 2 files changed, 32 insertions(+), 87 deletions(-)

diff --git a/sys/amd64/include/atomic.h b/sys/amd64/include/atomic.h
index ed80c426add2..f0191970e8cc 100644
--- a/sys/amd64/include/atomic.h
+++ b/sys/amd64/include/atomic.h
@@ -143,16 +143,10 @@ void	atomic_store_rel_##TYPE(volatile u_##TYPE *p, u_##TYPE v)
 #else /* !__GNUCLIKE_ASM */
 
 /*
- * For userland, always use lock prefixes so that the binaries will run
- * on both SMP and !SMP systems.
- */
-#if defined(SMP) || !defined(_KERNEL) || defined(KLD_MODULE)
-#define	MPLOCKED	"lock ; "
-#else
-#define	MPLOCKED
-#endif
-
-/*
+ * Always use lock prefixes.  The result is slightly less optimal for
+ * UP systems, but it matters less now, and sometimes UP is emulated
+ * over SMP.
+ *
  * The assembly is volatilized to avoid code chunk removal by the compiler.
  * GCC aggressively reorders operations and memory clobbering is necessary
  * in order to avoid that for memory barriers.
@@ -161,7 +155,7 @@ void	atomic_store_rel_##TYPE(volatile u_##TYPE *p, u_##TYPE v)
 static __inline void					\
 atomic_##NAME##_##TYPE(volatile u_##TYPE *p, u_##TYPE v)\
 {							\
-	__asm __volatile(MPLOCKED OP			\
+	__asm __volatile("lock; " OP			\
 	: "+m" (*p)					\
 	: CONS (V)					\
 	: "cc");					\
@@ -170,7 +164,7 @@ atomic_##NAME##_##TYPE(volatile u_##TYPE *p, u_##TYPE v)\
 static __inline void					\
 atomic_##NAME##_barr_##TYPE(volatile u_##TYPE *p, u_##TYPE v)\
 {							\
-	__asm __volatile(MPLOCKED OP			\
+	__asm __volatile("lock; " OP			\
 	: "+m" (*p)					\
 	: CONS (V)					\
 	: "memory", "cc");				\
@@ -199,8 +193,7 @@ atomic_cmpset_##TYPE(volatile u_##TYPE *dst, u_##TYPE expect, u_##TYPE src) \
 	u_char res;					\
 							\
 	__asm __volatile(				\
-	"	" MPLOCKED "		"		\
-	"	cmpxchg %3,%1 ;	"			\
+	"	lock; cmpxchg %3,%1 ;	"		\
 	"# atomic_cmpset_" #TYPE "	"		\
 	: "=@cce" (res),		/* 0 */		\
 	  "+m" (*dst),			/* 1 */		\
@@ -216,8 +209,7 @@ atomic_fcmpset_##TYPE(volatile u_##TYPE *dst, u_##TYPE *expect, u_##TYPE src) \
 	u_char res;					\
 							\
 	__asm __volatile(				\
-	"	" MPLOCKED "		"		\
-	"	cmpxchg %3,%1 ;	"			\
+	"	lock; cmpxchg %3,%1 ;	"		\
 	"# atomic_fcmpset_" #TYPE "	"		\
 	: "=@cce" (res),		/* 0 */		\
 	  "+m" (*dst),			/* 1 */		\
@@ -241,8 +233,7 @@ atomic_fetchadd_int(volatile u_int *p, u_int v)
 {
 
 	__asm __volatile(
-	"	" MPLOCKED "		"
-	"	xaddl	%0,%1 ;		"
+	"	lock; xaddl	%0,%1 ;	"
 	"# atomic_fetchadd_int"
 	: "+r" (v),			/* 0 */
 	  "+m" (*p)			/* 1 */
@@ -259,8 +250,7 @@ atomic_fetchadd_long(volatile u_long *p, u_long v)
 {
 
 	__asm __volatile(
-	"	" MPLOCKED "		"
-	"	xaddq	%0,%1 ;		"
+	"	lock; xaddq	%0,%1 ;	"
 	"# atomic_fetchadd_long"
 	: "+r" (v),			/* 0 */
 	  "+m" (*p)			/* 1 */
@@ -274,8 +264,7 @@ atomic_testandset_int(volatile u_int *p, u_int v)
 	u_char res;
 
 	__asm __volatile(
-	"	" MPLOCKED "		"
-	"	btsl	%2,%1 ;		"
+	"	lock; btsl	%2,%1 ;	"
 	"# atomic_testandset_int"
 	: "=@ccc" (res),		/* 0 */
 	  "+m" (*p)			/* 1 */
@@ -290,8 +279,7 @@ atomic_testandset_long(volatile u_long *p, u_int v)
 	u_char res;
 
 	__asm __volatile(
-	"	" MPLOCKED "		"
-	"	btsq	%2,%1 ;		"
+	"	lock; btsq	%2,%1 ;	"
 	"# atomic_testandset_long"
 	: "=@ccc" (res),		/* 0 */
 	  "+m" (*p)			/* 1 */
@@ -306,8 +294,7 @@ atomic_testandclear_int(volatile u_int *p, u_int v)
 	u_char res;
 
 	__asm __volatile(
-	"	" MPLOCKED "		"
-	"	btrl	%2,%1 ;		"
+	"	lock; btrl	%2,%1 ;	"
 	"# atomic_testandclear_int"
 	: "=@ccc" (res),		/* 0 */
 	  "+m" (*p)			/* 1 */
@@ -322,8 +309,7 @@ atomic_testandclear_long(volatile u_long *p, u_int v)
 	u_char res;
 
 	__asm __volatile(
-	"	" MPLOCKED "		"
-	"	btrq	%2,%1 ;		"
+	"	lock; btrq	%2,%1 ;	"
 	"# atomic_testandclear_long"
 	: "=@ccc" (res),		/* 0 */
 	  "+m" (*p)			/* 1 */
@@ -344,39 +330,18 @@ atomic_testandclear_long(volatile u_long *p, u_int v)
  * special address for "mem".  In the kernel, we use a private per-cpu
  * cache line.  In user space, we use a word in the stack's red zone
  * (-8(%rsp)).
- *
- * For UP kernels, however, the memory of the single processor is
- * always consistent, so we only need to stop the compiler from
- * reordering accesses in a way that violates the semantics of acquire
- * and release.
  */
-#if defined(_KERNEL)
-
-#if defined(SMP) || defined(KLD_MODULE)
 static __inline void
 __storeload_barrier(void)
 {
-
+#if defined(_KERNEL)
 	__asm __volatile("lock; addl $0,%%gs:%0"
 	    : "+m" (*(u_int *)OFFSETOF_MONITORBUF) : : "memory", "cc");
-}
-#else /* _KERNEL && UP */
-static __inline void
-__storeload_barrier(void)
-{
-
-	__compiler_membar();
-}
-#endif /* SMP */
 #else /* !_KERNEL */
-static __inline void
-__storeload_barrier(void)
-{
-
 	__asm __volatile("lock; addl $0,-8(%%rsp)" : : : "memory", "cc");
-}
 #endif /* _KERNEL*/
+}
 
 #define	ATOMIC_LOAD(TYPE)				\
 static __inline u_##TYPE				\
diff --git a/sys/i386/include/atomic.h b/sys/i386/include/atomic.h
index c65a7ac83a93..2e72de935c23 100644
--- a/sys/i386/include/atomic.h
+++ b/sys/i386/include/atomic.h
@@ -141,16 +141,10 @@ void	atomic_subtract_64(volatile uint64_t *, uint64_t);
 #else /* !__GNUCLIKE_ASM */
 
 /*
- * For userland, always use lock prefixes so that the binaries will run
- * on both SMP and !SMP systems.
- */
-#if defined(SMP) || !defined(_KERNEL) || defined(KLD_MODULE)
-#define	MPLOCKED	"lock ; "
-#else
-#define	MPLOCKED
-#endif
-
-/*
+ * Always use lock prefixes.  The result is slightly less optimal for
+ * UP systems, but it matters less now, and sometimes UP is emulated
+ * over SMP.
+ *
  * The assembly is volatilized to avoid code chunk removal by the compiler.
  * GCC aggressively reorders operations and memory clobbering is necessary
  * in order to avoid that for memory barriers.
@@ -159,7 +153,7 @@ void	atomic_subtract_64(volatile uint64_t *, uint64_t);
 static __inline void					\
 atomic_##NAME##_##TYPE(volatile u_##TYPE *p, u_##TYPE v)\
 {							\
-	__asm __volatile(MPLOCKED OP			\
+	__asm __volatile("lock; " OP			\
 	: "+m" (*p)					\
 	: CONS (V)					\
 	: "cc");					\
@@ -168,7 +162,7 @@ atomic_##NAME##_##TYPE(volatile u_##TYPE *p, u_##TYPE v)\
 static __inline void					\
 atomic_##NAME##_barr_##TYPE(volatile u_##TYPE *p, u_##TYPE v)\
 {							\
-	__asm __volatile(MPLOCKED OP			\
+	__asm __volatile("lock; " OP			\
 	: "+m" (*p)					\
 	: CONS (V)					\
 	: "memory", "cc");				\
@@ -197,8 +191,7 @@ atomic_cmpset_##TYPE(volatile u_##TYPE *dst, u_##TYPE expect, u_##TYPE src) \
 	u_char res;					\
 							\
 	__asm __volatile(				\
-	"	" MPLOCKED "		"		\
-	"	cmpxchg %3,%1 ;	"			\
+	"	lock; cmpxchg %3,%1 ;	"		\
 	"	sete	%0 ;		"		\
 	"# atomic_cmpset_" #TYPE "	"		\
 	: "=q" (res),			/* 0 */		\
@@ -215,8 +208,7 @@ atomic_fcmpset_##TYPE(volatile u_##TYPE *dst, u_##TYPE *expect, u_##TYPE src) \
 	u_char res;					\
 							\
 	__asm __volatile(				\
-	"	" MPLOCKED "		"		\
-	"	cmpxchg %3,%1 ;	"			\
+	"	lock; cmpxchg %3,%1 ;	"		\
 	"	sete	%0 ;		"		\
 	"# atomic_fcmpset_" #TYPE "	"		\
 	: "=q" (res),			/* 0 */		\
@@ -240,8 +232,7 @@ atomic_fetchadd_int(volatile u_int *p, u_int v)
 {
 
 	__asm __volatile(
-	"	" MPLOCKED "		"
-	"	xaddl	%0,%1 ;		"
+	"	lock; xaddl	%0,%1 ;	"
 	"# atomic_fetchadd_int"
 	: "+r" (v),			/* 0 */
 	  "+m" (*p)			/* 1 */
@@ -255,8 +246,7 @@ atomic_testandset_int(volatile u_int *p, u_int v)
 	u_char res;
 
 	__asm __volatile(
-	"	" MPLOCKED "		"
-	"	btsl	%2,%1 ;		"
+	"	lock; btsl	%2,%1 ;	"
 	"	setc	%0 ;		"
 	"# atomic_testandset_int"
 	: "=q" (res),			/* 0 */
 	  "+m" (*p)			/* 1 */
@@ -272,8 +262,7 @@ atomic_testandclear_int(volatile u_int *p, u_int v)
 	u_char res;
 
 	__asm __volatile(
-	"	" MPLOCKED "		"
-	"	btrl	%2,%1 ;		"
+	"	lock; btrl	%2,%1 ;	"
 	"	setc	%0 ;		"
 	"# atomic_testandclear_int"
 	: "=q" (res),			/* 0 */
 	  "+m" (*p)			/* 1 */
@@ -303,11 +292,7 @@ atomic_testandclear_int(volatile u_int *p, u_int v)
  */
 #if defined(_KERNEL)
-#if defined(SMP) || defined(KLD_MODULE)
 #define	__storeload_barrier()	__mbk()
-#else /* _KERNEL && UP */
-#define	__storeload_barrier()	__compiler_membar()
-#endif /* SMP */
 #else /* !_KERNEL */
 #define	__storeload_barrier()	__mbu()
 #endif /* _KERNEL*/
 
@@ -484,8 +469,7 @@ atomic_cmpset_64_i586(volatile uint64_t *dst, uint64_t expect, uint64_t src)
 	u_char res;
 
 	__asm __volatile(
-	"	" MPLOCKED "		"
-	"	cmpxchg8b %1 ;	"
+	"	lock; cmpxchg8b %1 ;	"
 	"	sete	%0"
 	: "=q" (res),			/* 0 */
 	  "+m" (*dst),			/* 1 */
@@ -502,8 +486,7 @@ atomic_fcmpset_64_i586(volatile uint64_t *dst, uint64_t *expect, uint64_t src)
 	u_char res;
 
 	__asm __volatile(
-	"	" MPLOCKED "		"
-	"	cmpxchg8b %1 ;	"
+	"	lock; cmpxchg8b %1 ;	"
 	"	sete	%0"
 	: "=q" (res),			/* 0 */
 	  "+m" (*dst),			/* 1 */
@@ -522,8 +505,7 @@ atomic_load_acq_64_i586(volatile uint64_t *p)
 	__asm __volatile(
 	"	movl	%%ebx,%%eax ;	"
 	"	movl	%%ecx,%%edx ;	"
-	"	" MPLOCKED "		"
-	"	cmpxchg8b %1"
+	"	lock; cmpxchg8b %1"
 	: "=&A" (res),			/* 0 */
 	  "+m" (*p)			/* 1 */
 	: : "memory", "cc");
@@ -538,8 +520,7 @@ atomic_store_rel_64_i586(volatile uint64_t *p, uint64_t v)
 	"	movl	%%eax,%%ebx ;	"
 	"	movl	%%edx,%%ecx ;	"
 	"1:			"
-	"	" MPLOCKED "		"
-	"	cmpxchg8b %0 ;	"
+	"	lock; cmpxchg8b %0 ;	"
 	"	jne	1b"
 	: "+m" (*p),			/* 0 */
 	  "+A" (v)			/* 1 */
@@ -554,8 +535,7 @@ atomic_swap_64_i586(volatile uint64_t *p, uint64_t v)
 	"	movl	%%eax,%%ebx ;	"
 	"	movl	%%edx,%%ecx ;	"
 	"1:			"
-	"	" MPLOCKED "		"
-	"	cmpxchg8b %0 ;	"
+	"	lock; cmpxchg8b %0 ;	"
 	"	jne	1b"
 	: "+m" (*p),			/* 0 */
 	  "+A" (v)			/* 1 */
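
[Editor's note: the following standalone sketch is not part of the commit.
It illustrates, in userland C for GCC/Clang on x86-64, the shape the patched
primitives now share: every read-modify-write carries an unconditional lock
prefix, and the store/load barrier is a locked no-op add, here against the
stack red zone as in the !_KERNEL branch above.  The my_*() names are made
up for illustration.]

#include <stdio.h>

/* atomic_add_int pattern: a lock-prefixed RMW, as ATOMIC_ASM() now
 * always generates. */
static inline void
my_atomic_add_int(volatile unsigned int *p, unsigned int v)
{
	__asm __volatile("lock; addl %1,%0"
	    : "+m" (*p)
	    : "ir" (v)
	    : "cc");
}

/* atomic_fetchadd_int pattern: xaddl exchanges the addend register
 * with the old memory value, so v returns the previous contents. */
static inline unsigned int
my_fetchadd_int(volatile unsigned int *p, unsigned int v)
{
	__asm __volatile("lock; xaddl %0,%1"
	    : "+r" (v), "+m" (*p)
	    : : "cc");
	return (v);
}

/* Userland __storeload_barrier() pattern: a locked no-op RMW on the
 * red zone word orders earlier stores before later loads. */
static inline void
my_storeload_barrier(void)
{
	__asm __volatile("lock; addl $0,-8(%%rsp)" : : : "memory", "cc");
}

int
main(void)
{
	volatile unsigned int counter = 40;
	unsigned int old;

	my_atomic_add_int(&counter, 1);
	old = my_fetchadd_int(&counter, 1);
	my_storeload_barrier();
	printf("old %u, now %u\n", old, counter);	/* old 41, now 42 */
	return (0);
}

On x86 a lock-prefixed instruction is also a full memory barrier, which is
why the dummy "lock; addl $0" suffices where a store/load fence is needed.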
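
[Editor's note: likewise not part of the commit, a hypothetical userland
rendition of the compare-and-set pattern in the i386 style, where sete
captures ZF after the locked cmpxchg; the amd64 header instead uses the
"=@cce" flag-output constraint.  my_cmpset_int() is an invented name.]

#include <stdio.h>

/* cmpxchg compares %eax (expect) with *dst: on match it stores src and
 * sets ZF; on mismatch it loads *dst into %eax and clears ZF. */
static inline int
my_cmpset_int(volatile unsigned int *dst, unsigned int expect,
    unsigned int src)
{
	unsigned char res;

	__asm __volatile(
	"	lock; cmpxchg %3,%1 ;	"
	"	sete	%0"
	: "=q" (res),			/* 0 */
	  "+m" (*dst),			/* 1 */
	  "+a" (expect)			/* 2 */
	: "r" (src)			/* 3 */
	: "memory", "cc");
	return (res);
}

int
main(void)
{
	volatile unsigned int word = 1;

	/* Succeeds: *dst == expect, so src is stored and 1 is returned. */
	printf("%d\n", my_cmpset_int(&word, 1, 2));	/* 1, word now 2 */
	/* Fails: *dst != expect, nothing is stored and 0 is returned. */
	printf("%d\n", my_cmpset_int(&word, 1, 3));	/* 0, word still 2 */
	return (0);
}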