From nobody Fri Jan 13 21:24:47 2023 X-Original-To: dev-commits-src-branches@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4NtvZS1CbJz2qkSb; Fri, 13 Jan 2023 21:24:48 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4NtvZR6VgLz40rT; Fri, 13 Jan 2023 21:24:47 +0000 (UTC) (envelope-from git@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1673645087; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=kYT5OpYSmZwaT/aLvWT/Y7u4BUpi13UUrPqkvCOTR7o=; b=WGRnXdKrA8e6Os7MtW7c2PpPbdZMZB/f4+ZU8lSKTgZTmUsgAZtIIV3BUatpRlbf2wnzAf 1ebUuyPHuMENGMI/tlRtWlApiu26i52JEsSgHXTrRhMPbXk0/9jsjJxs0+hpvjFD2EiHgb nhQyoDO7fNBUYnym0kHj5Y9Ux/uw6y9d3+/DAuTXVeaNKNh7WEU1FbdhgA+8MoXpqygSTI uRyg3RbnoIrpd87IRWE2eihbkF/TdCn7JvtxFnn6gknaClWbT2ygO2RoQchg8Rl/xx5DOr Bb+ptjttL67uLNHftlUVWndf2DYoA+5287MsbTBHLUYgZA1+boBe98Pl5Qj4ow== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1673645087; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=kYT5OpYSmZwaT/aLvWT/Y7u4BUpi13UUrPqkvCOTR7o=; b=VzMGOt1a/fXvyJOpYSnGNkm0neg4+cinM/JIhmgbL6A7ofLJWIGMrJamHgwVAvz3fsTUT8 XtzS/5VxjYMt8nWO5pv9j6a4EgqjFZ7C5KWj8XKYmXM2FbiHlQXCk3v9s3Yv9kFlMvVKWI 9bZwffbWd9+YZZn33QdwehmFcWT6Fr9kqbeWFNHg4zLtxeLcUdOeauQM/fk7a+wyzVYm+N T+JwsdDhoxZcHI1IYsNs1S/c/lPEJQkZrU3PY81UKm1HYyGzlTw5ErhbAwYcr6dQ1hEBvw JdwzdMZomkDoXryOCTxVqTmT8Mnu5WA+t3i55iagDFMtKT8L8rdBvJEJzVmG8Q== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1673645087; a=rsa-sha256; cv=none; b=glBwIc3xI0jnr4Igull0EqFSMZqwEstKx7Yq/y3VKwDr1QB2cE9YD5sOfZZ2b1gqfZVTqi 7+PhVz4Y/ow9G9nXN7oatLZECRoBIO5Lq9t1w/nB66JCl8qBiHPlR2z7iAnX3av3kXx8QS Dk74Ku5EsctkKoTDUH6L7OhIUQ9u22QcRdziEQusUzEaeOlN1JwYZYzB+u7/MvtE0nqqWc k+m9usovasS3eiE+Xn9VmfNZZxugYMh03qk1Bbrm3XW21B3iyYZ7kQ5vH9NoSPbODs/1Al vy1kQErmvrXWBdyhdd8godNC6WrxW3B7Pse5Gbs2Ax21Nk2DFoQBtUYfrK+7Gg== Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 4NtvZR4TMwzMVt; Fri, 13 Jan 2023 21:24:47 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.16.1/8.16.1) with ESMTP id 30DLOleM040705; Fri, 13 Jan 2023 21:24:47 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.16.1/8.16.1/Submit) id 30DLOl4Y040704; Fri, 13 Jan 2023 21:24:47 GMT (envelope-from git) Date: Fri, 13 Jan 2023 21:24:47 GMT Message-Id: <202301132124.30DLOl4Y040704@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org From: "Alexander V. Chernikov" Subject: git: e8508a64b7d2 - stable/13 - routing: actually sort nexthops in nhgs by their index List-Id: Commits to the stable branches of the FreeBSD src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-branches List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-dev-commits-src-branches@freebsd.org X-BeenThere: dev-commits-src-branches@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: melifaro X-Git-Repository: src X-Git-Refname: refs/heads/stable/13 X-Git-Reftype: branch X-Git-Commit: e8508a64b7d25cd189636fe5a7eecabe27f6c6a9 Auto-Submitted: auto-generated X-ThisMailContainsUnwantedMimeParts: N The branch stable/13 has been updated by melifaro: URL: https://cgit.FreeBSD.org/src/commit/?id=e8508a64b7d25cd189636fe5a7eecabe27f6c6a9 commit e8508a64b7d25cd189636fe5a7eecabe27f6c6a9 Author: Alexander V. Chernikov AuthorDate: 2022-06-27 17:19:50 +0000 Commit: Alexander V. Chernikov CommitDate: 2023-01-13 21:18:25 +0000 routing: actually sort nexthops in nhgs by their index Nexthops in the nexthop groups needs to be deterministically sorted by some their property to simplify reporting cost when changing large nexthop groups. Fix reporting by actually sorting next hops by their indices (`wn_cmp_idx()`). As calc_min_mpath_slots_fast() has an assumption that next hops are sorted using their relative weight in the nexthop groups, it needs to be addressed as well. The latter sorting is required to quickly determine the layout of the next hops in the actual forwarding group. For example, what's the best way to split the traffic between nhops with weights 19,31 and 47 if the maximum nexthop group width is 64? It is worth mentioning that such sorting is only required during nexthop group creation and is not used elsewhere. Lastly, normally all nexthop are of the same weight. With that in mind, (a) use spare 32 bytes inside `struct weightened_nexthop` to avoid another memory allocation and (b) use insertion sort to sort the nexthop weights. Reported by: thj Tested by: Claudio Jeker Differential Revision: https://reviews.freebsd.org/D35599 MFC after: 2 weeks (cherry picked from commit 76f1ab8eff9ede509906e539c10373db44528690) --- sys/net/route/nhgrp_ctl.c | 77 ++++++++++++++++++++++++++++++----------------- sys/net/route/nhop.h | 1 + 2 files changed, 50 insertions(+), 28 deletions(-) diff --git a/sys/net/route/nhgrp_ctl.c b/sys/net/route/nhgrp_ctl.c index ee3ac9bec3d6..bad237a334ef 100644 --- a/sys/net/route/nhgrp_ctl.c +++ b/sys/net/route/nhgrp_ctl.c @@ -76,7 +76,7 @@ CHK_STRUCT_FIELD_GENERIC(struct nhop_object, nh_flags, struct nhgrp_object, nhg_ /* Cap multipath to 64, as the larger values would break rib_cmd_info bmasks */ CTASSERT(RIB_MAX_MPATH_WIDTH <= 64); -static int wn_cmp(const void *a, const void *b); +static int wn_cmp_idx(const void *a, const void *b); static void sort_weightened_nhops(struct weightened_nhop *wn, int num_nhops); static struct nhgrp_priv *get_nhgrp(struct nh_control *ctl, @@ -86,40 +86,50 @@ static void destroy_nhgrp_epoch(epoch_context_t ctx); static void free_nhgrp_nhops(struct nhgrp_priv *nhg_priv); static int -wn_cmp(const void *a, const void *b) +wn_cmp_idx(const void *a, const void *b) { - const struct weightened_nhop *wa = a; - const struct weightened_nhop *wb = b; + const struct weightened_nhop *w_a = a; + const struct weightened_nhop *w_b = b; + uint32_t a_idx = w_a->nh->nh_priv->nh_idx; + uint32_t b_idx = w_b->nh->nh_priv->nh_idx; - if (wa->weight > wb->weight) - return (1); - else if (wa->weight < wb->weight) + if (a_idx < b_idx) return (-1); - - /* Compare nexthops by pointer */ - if (wa->nh > wb->nh) + else if (a_idx > b_idx) return (1); - else if (wa->nh < wb->nh) - return (-1); else return (0); } /* * Perform in-place sorting for array of nexthops in @wn. - * - * To avoid nh groups duplication, nexthops/weights in the - * @wn need to be ordered deterministically. - * As this sorting is needed only for the control plane functionality, - * there are no specific external requirements. - * - * Sort by weight first, to ease calculation of the slot sizes. + * Sort by nexthop index ascending. */ static void sort_weightened_nhops(struct weightened_nhop *wn, int num_nhops) { - qsort(wn, num_nhops, sizeof(struct weightened_nhop), wn_cmp); + qsort(wn, num_nhops, sizeof(struct weightened_nhop), wn_cmp_idx); +} + +/* + * In order to determine the minimum weight difference in the array + * of weights, create a sorted array of weights, using spare "storage" + * field in the `struct weightened_nhop`. + * Assume weights to be (mostly) the same and use insertion sort to + * make it sorted. + */ +static void +sort_weightened_nhops_weights(struct weightened_nhop *wn, int num_items) +{ + wn[0].storage = wn[0].weight; + for (int i = 1, j = 0; i < num_items; i++) { + uint32_t weight = wn[i].weight; // read from 'weight' as it's not reordered + /* Move all weights > weight 1 position right */ + for (j = i - 1; j >= 0 && wn[j].storage > weight; j--) + wn[j + 1].storage = wn[j].storage; + wn[j + 1].storage = weight; + } } /* @@ -136,19 +146,26 @@ sort_weightened_nhops(struct weightened_nhop *wn, int num_nhops) * (1, 100), (2, 200), (3, 400) -> 7 slots [1, 2, 2, 3, 3, 3] */ static uint32_t -calc_min_mpath_slots_fast(const struct weightened_nhop *wn, size_t num_items) +calc_min_mpath_slots_fast(struct weightened_nhop *wn, size_t num_items, + uint64_t *ptotal) { uint32_t i, last, xmin; uint64_t total = 0; + // Get sorted array of weights in .storage field + sort_weightened_nhops_weights(wn, num_items); + last = 0; - xmin = wn[0].weight; + xmin = wn[0].storage; for (i = 0; i < num_items; i++) { - total += wn[i].weight; - if ((wn[i].weight - last < xmin) && (wn[i].weight != last)) - xmin = wn[i].weight - last; - last = wn[i].weight; + total += wn[i].storage; + if ((wn[i].storage != last) && + ((wn[i].storage - last < xmin) || xmin == 0)) { + xmin = wn[i].storage - last; + } + last = wn[i].storage; } + *ptotal = total; /* xmin is the minimum unit of desired capacity */ if ((total % xmin) != 0) return (0); @@ -170,11 +187,14 @@ calc_min_mpath_slots_fast(const struct weightened_nhop *wn, size_t num_items) * RIB_MAX_MPATH_WIDTH in case of any failure. */ static uint32_t -calc_min_mpath_slots(const struct weightened_nhop *wn, size_t num_items) +calc_min_mpath_slots(struct weightened_nhop *wn, size_t num_items) { uint32_t v; + uint64_t total; - v = calc_min_mpath_slots_fast(wn, num_items); + v = calc_min_mpath_slots_fast(wn, num_items, &total); + if (total == 0) + return (0); if ((v == 0) || (v > RIB_MAX_MPATH_WIDTH)) v = RIB_MAX_MPATH_WIDTH; @@ -364,6 +384,7 @@ nhgrp_free(struct nhgrp_object *nhg) } NET_EPOCH_EXIT(et); + KASSERT((nhg_priv->nhg_idx == 0), ("gr_idx != 0")); epoch_call(net_epoch_preempt, destroy_nhgrp_epoch, &nhg_priv->nhg_epoch_ctx); } diff --git a/sys/net/route/nhop.h b/sys/net/route/nhop.h index 1d37a84b68b4..12bbe163788f 100644 --- a/sys/net/route/nhop.h +++ b/sys/net/route/nhop.h @@ -166,6 +166,7 @@ struct nhop_object { struct weightened_nhop { struct nhop_object *nh; uint32_t weight; + uint32_t storage; }; void nhop_free(struct nhop_object *nh);