From nobody Mon Dec 04 18:59:31 2023
Date: Mon, 4 Dec 2023 18:59:31 GMT
Message-Id: <202312041859.3B4IxVcB043749@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Gleb Smirnoff
Subject: git: e3cbc572f154 - main - kern/subr_trap.c: repair the HPTS performance hack in userret()
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
Sender: owner-dev-commits-src-all@freebsd.org
X-BeenThere:
 dev-commits-src-all@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: glebius
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: e3cbc572f1541fdc18be9971d23e210d5018e662
Auto-Submitted: auto-generated

The branch main has been updated by glebius:

URL: https://cgit.FreeBSD.org/src/commit/?id=e3cbc572f1541fdc18be9971d23e210d5018e662

commit e3cbc572f1541fdc18be9971d23e210d5018e662
Author:     Gleb Smirnoff
AuthorDate: 2023-12-04 18:19:46 +0000
Commit:     Gleb Smirnoff
CommitDate: 2023-12-04 18:19:46 +0000

    kern/subr_trap.c: repair the HPTS performance hack in userret()

    It wasn't functional as subr_trap.c doesn't include opt_inet.h.
    Put a better comment provided by gallatin@ in place of the old one.

    The idea is to use userret() as a cheap place to call a soft clock.
    This approach saves CPU on busy machines and saves power on idle
    machines. An alternative would be to constantly schedule callouts.
    Running with neither callouts nor the soft clock ruins HPTS precision.

    Reviewed by:            tuexen, rrs
    Differential Revision:  https://reviews.freebsd.org/D42860
---
 sys/kern/subr_trap.c   | 20 ++++++++++++--------
 sys/netinet/tcp_hpts.h |  1 -
 sys/netinet/tcp_lro.c  |  4 +---
 sys/sys/systm.h        |  6 ++++++
 4 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/sys/kern/subr_trap.c b/sys/kern/subr_trap.c
index 8720d9f71c1c..e9a16cd0b36e 100644
--- a/sys/kern/subr_trap.c
+++ b/sys/kern/subr_trap.c
@@ -74,6 +74,8 @@
 #include
 #endif
 
+void	(*tcp_hpts_softclock)(void);
+
 /*
  * Define the code needed before returning to user mode, for trap and
  * syscall.
@@ -125,16 +127,18 @@ userret(struct thread *td, struct trapframe *frame)
 	if (PMC_THREAD_HAS_SAMPLES(td))
 		PMC_CALL_HOOK(td, PMC_FN_THR_USERRET, NULL);
 #endif
-#ifdef TCPHPTS
 	/*
-	 * @gallatin is adament that this needs to go here, I
-	 * am not so sure. Running hpts is a lot like
-	 * a lro_flush() that happens while a user process
-	 * is running. But he may know best so I will go
-	 * with his view of accounting. :-)
+	 * Calling tcp_hpts_softclock() here allows us to avoid frequent,
+	 * expensive callouts that trash the cache and lead to a much higher
+	 * number of interrupts and context switches. Testing on busy web
+	 * servers at Netflix has shown that this improves CPU use by 7% over
+	 * relying only on callouts to drive HPTS, and also results in idle
+	 * power savings on mostly idle servers.
+	 * This was inspired by the paper "Soft Timers: Efficient Microsecond
+	 * Software Timer Support for Network Processing"
+	 * by Mohit Aron and Peter Druschel.
 	 */
-	tcp_run_hpts();
-#endif
+	tcp_hpts_softclock();
 	/*
 	 * Let the scheduler adjust our priority etc.
 	 */
diff --git a/sys/netinet/tcp_hpts.h b/sys/netinet/tcp_hpts.h
index 8ca21daf60de..7eb1b2e08cb4 100644
--- a/sys/netinet/tcp_hpts.h
+++ b/sys/netinet/tcp_hpts.h
@@ -152,7 +152,6 @@ void __tcp_set_hpts(struct tcpcb *tp, int32_t line);
 
 void tcp_set_inp_to_drop(struct inpcb *inp, uint16_t reason);
 
-extern void (*tcp_hpts_softclock)(void);
 void tcp_lro_hpts_init(void);
 
 extern int32_t tcp_min_hptsi_time;
diff --git a/sys/netinet/tcp_lro.c b/sys/netinet/tcp_lro.c
index 255e543ae21d..921d28f82517 100644
--- a/sys/netinet/tcp_lro.c
+++ b/sys/netinet/tcp_lro.c
@@ -89,7 +89,6 @@ SYSCTL_NODE(_net_inet_tcp, OID_AUTO, lro, CTLFLAG_RW | CTLFLAG_MPSAFE, 0,
 
 long tcplro_stacks_wanting_mbufq;
 int	(*tcp_lro_flush_tcphpts)(struct lro_ctrl *lc, struct lro_entry *le);
-void	(*tcp_hpts_softclock)(void);
 
 counter_u64_t tcp_inp_lro_direct_queue;
 counter_u64_t tcp_inp_lro_wokeup_queue;
@@ -1262,8 +1261,7 @@ tcp_lro_flush_all(struct lro_ctrl *lc)
 done:
 	/* flush active streams */
 	tcp_lro_rx_done(lc);
-	if (tcp_hpts_softclock != NULL)
-		tcp_hpts_softclock();
+	tcp_hpts_softclock();
 	lc->lro_mbuf_count = 0;
 }
 
diff --git a/sys/sys/systm.h b/sys/sys/systm.h
index 2532bc3d9926..06d40481375f 100644
--- a/sys/sys/systm.h
+++ b/sys/sys/systm.h
@@ -378,6 +378,12 @@ void cpu_et_frequency(struct eventtimer *et, uint64_t newfreq);
 extern int cpu_disable_c2_sleep;
 extern int cpu_disable_c3_sleep;
 
+extern void (*tcp_hpts_softclock)(void);
+#define tcp_hpts_softclock() do { \
+	if (tcp_hpts_softclock != NULL) \
+		tcp_hpts_softclock(); \
+} while (0)
+
 char *kern_getenv(const char *name);
 void freeenv(char *env);
 int getenv_int(const char *name, int *data);
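
The systm.h hunk gives a function-like macro the same name as the function
pointer it guards. A minimal userland sketch of that pattern follows; it is
an illustration, not the kernel code. The names hpts_hook, fake_softclock,
fake_userret and hook_demo are hypothetical stand-ins that do not appear in
the commit. The sketch relies on two standard C preprocessor rules: a
function-like macro is only expanded when its name is followed by "(", and a
macro is not expanded again inside its own expansion.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's tcp_hpts_softclock pointer. */
void (*hpts_hook)(void);

/*
 * Function-like macro sharing the pointer's name.  "hpts_hook" without
 * parentheses still names the pointer; "hpts_hook()" becomes a
 * NULL-checked call, and the call inside the expansion is not re-expanded.
 */
#define hpts_hook() do {		\
	if (hpts_hook != NULL)		\
		hpts_hook();		\
} while (0)

static int softclock_runs;

/* Hypothetical hook body: only counts how often it was invoked. */
static void
fake_softclock(void)
{
	softclock_runs++;
}

/* Stand-in for a hot path such as userret(): call the hook unconditionally. */
static void
fake_userret(void)
{
	hpts_hook();
}

/*
 * Simulate: hook unset, hook installed (as a loadable module might do),
 * hook removed again.  Returns how many times the hook actually ran.
 */
int
hook_demo(void)
{
	softclock_runs = 0;
	fake_userret();			/* hook is NULL: nothing happens */
	hpts_hook = fake_softclock;	/* plain name, macro not expanded */
	fake_userret();
	fake_userret();
	hpts_hook = NULL;
	fake_userret();			/* hook removed: nothing happens */
	return (softclock_runs);
}
```

With this arrangement callers never spell out the NULL test, which is what
lets the diff above drop the explicit check in tcp_lro_flush_all() and call
the hook from userret() without any #ifdef.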