git: 9882b8313200 - stable/12 - netmap: iflib: stop krings during interface reset

Vincenzo Maffione vmaffione at FreeBSD.org
Sun Jan 17 13:34:40 UTC 2021


The branch stable/12 has been updated by vmaffione:

URL: https://cgit.FreeBSD.org/src/commit/?id=9882b83132007acc0e6cef6d248fb12cdebba278

commit 9882b83132007acc0e6cef6d248fb12cdebba278
Author:     Vincenzo Maffione <vmaffione at FreeBSD.org>
AuthorDate: 2021-01-09 20:54:11 +0000
Commit:     Vincenzo Maffione <vmaffione at FreeBSD.org>
CommitDate: 2021-01-17 12:02:44 +0000

    netmap: iflib: stop krings during interface reset
    
    When different processes open separate subsets of the
    available rings of a same netmap interface, a device
    reset may be performed while one of the processes
    is actively using some rings (e.g., caused by another
    process executing a nmport_open()).
    With this patch, such situation will cause the
    active process to get a POLLERR, so that it can
    have a chance to detect the situation.
    We also guarantee that no process is running a txsync
    or rxsync (ioctl or poll) while an iflib device reset
    is in progress.
    
    PR:         252453
    MFC after:  1 week
    
    (cherry picked from commit 1d238b07d5d4d9660ae0e08daede6da7e91c7853)
    
    netmap: iflib: fix asserts in netmap_fl_refill()
    
    When netmap_fl_refill() is called at initialization time (e.g.,
    during netmap_iflib_register()), nic_i must be 0, since the
    free list is reinitialized. At the end of the refill cycle, nic_i
    must still be zero, because exactly N descriptors (N is the ring size)
    are refilled.
    This patch therefore fixes the assertions to check on nic_i rather
    than on nm_i. The current netmap_reset() may in fact cause nm_i
    to be != 0 while the device is resetting: this may happen when
    multiple non-cooperating processes open different subsets of the
    available netmap rings.
    
    PR:         252518
    MFC after:  1 week
    
    (cherry picked from commit 3189ba61673af5bd3ed2ee9e33be92bc0b9995ba)
    
    netmap: refactor netmap_reset
    
    The netmap_reset() function is meant to be called by the driver
    when they initialize (or re-initialize) a hardware ring.
    However, since the introduction of support for opening (in
    netmap mode) a subset of the available rings, netmap_reset()
    may be called multiple times on actively used rings, causing
    both kring and netmap ring to transition to an inconsistent
    state.
    This changes improves the situation by resetting all the
    indices fields of the kring to 0, as expected after the
    reinitialization of a hardware ring.
    
    PR:         252518
    MFC after:  1 week
    
    (cherry picked from commit 7ba6ecf216fb15e8b147db268b91d9b82c5a0682)
    
    netmap: iflib: enable/disable krings on any interface reinit
    
    Since 1d238b07d5d4d9660ae0e0, krings are disabled before
    a reinit cycle triggered by iflib_netmap_register.
    However, this operation is actually necessary also for
    any interface reinit triggered by other causes (i.e.,
    ifconfig commands).
    We achieve this goal by moving the krings enable/disable
    operation inside iflib_stop() and iflib_init_locked().
    
    Once here, this change also removes some redundant operations
    from iflib_netmap_register(), that are already performed by
    iflib_stop().
    
    PR:             252453
    MFC after:      1 week
    
    (cherry picked from commit 3d65fd97e85ab807f3baa62aa979ae527914a3ea)
    
    iflib: fix build failure in case DEV_NETMAP is not defined
    
    This addresses the build failure introduced by
    3d65fd97e85ab807f3baa62.
    
    MFC with: 3d65fd97e85ab807f3baa62
    
    (cherry picked from commit 8aa8484cbf06bb26ad87177fe4aebcaf1519f138)
    
    netmap: restore hwofs and support it in iflib
    
    Restore the hwofs functionality temporarily disabled by
    7ba6ecf216fb15e8b147db2 to prevent issues with iflib.
    This patch brings the necessary changes to iflib to
    enable howfs to allow interface restarts without
    disrupting netmap applications actively using its
    rings.
    After this change, it becomes possible for multiple
    non-cooperating netmap applications to use non-overlapping
    subsets of the available netmap rings without clashing
    with each other.
    
    PR:             252453
    MFC after:      1 week
    
    (cherry picked from commit 55f0ad5fdee1a779d6889481ba591a819081b9ca)
---
 sys/dev/netmap/netmap.c | 94 +++++++++++++++++++++++--------------------------
 sys/net/iflib.c         | 67 +++++++++++++++++++++++++----------
 2 files changed, 94 insertions(+), 67 deletions(-)

diff --git a/sys/dev/netmap/netmap.c b/sys/dev/netmap/netmap.c
index 982f76d45faa..075c27b7a597 100644
--- a/sys/dev/netmap/netmap.c
+++ b/sys/dev/netmap/netmap.c
@@ -633,7 +633,7 @@ void
 netmap_disable_all_rings(struct ifnet *ifp)
 {
 	if (NM_NA_VALID(ifp)) {
-		netmap_set_all_rings(NA(ifp), NM_KR_STOPPED);
+		netmap_set_all_rings(NA(ifp), NM_KR_LOCKED);
 	}
 }
 
@@ -4056,75 +4056,71 @@ done:
 
 
 /*
- * netmap_reset() is called by the driver routines when reinitializing
- * a ring. The driver is in charge of locking to protect the kring.
- * If native netmap mode is not set just return NULL.
- * If native netmap mode is set, in particular, we have to set nr_mode to
- * NKR_NETMAP_ON.
+ * Reset function to be called by the driver routines when reinitializing
+ * a hardware ring. The driver is in charge of locking to protect the kring
+ * while this operation is being performed. This is normally achieved by
+ * calling netmap_disable_all_rings() before triggering a reset.
+ * If the kring is not in netmap mode, return NULL to inform the caller
+ * that this is the case.
+ * If the kring is in netmap mode, set hwofs so that the netmap indices
+ * seen by userspace (head/cut/tail) do not change, although the internal
+ * NIC indices have been reset to 0.
+ * In any case, adjust kring->nr_mode.
  */
 struct netmap_slot *
 netmap_reset(struct netmap_adapter *na, enum txrx tx, u_int n,
 	u_int new_cur)
 {
 	struct netmap_kring *kring;
-	int new_hwofs, lim;
+	u_int new_hwtail, new_hwofs;
 
 	if (!nm_native_on(na)) {
 		nm_prdis("interface not in native netmap mode");
 		return NULL;	/* nothing to reinitialize */
 	}
 
-	/* XXX note- in the new scheme, we are not guaranteed to be
-	 * under lock (e.g. when called on a device reset).
-	 * In this case, we should set a flag and do not trust too
-	 * much the values. In practice: TODO
-	 * - set a RESET flag somewhere in the kring
-	 * - do the processing in a conservative way
-	 * - let the *sync() fixup at the end.
-	 */
 	if (tx == NR_TX) {
 		if (n >= na->num_tx_rings)
 			return NULL;
-
 		kring = na->tx_rings[n];
-
-		if (kring->nr_pending_mode == NKR_NETMAP_OFF) {
-			kring->nr_mode = NKR_NETMAP_OFF;
-			return NULL;
-		}
-
-		// XXX check whether we should use hwcur or rcur
-		new_hwofs = kring->nr_hwcur - new_cur;
+		/*
+		 * Set hwofs to rhead, so that slots[rhead] is mapped to
+		 * the NIC internal slot 0, and thus the netmap buffer
+		 * at rhead is the next to be transmitted. Transmissions
+		 * that were pending before the reset are considered as
+		 * sent, so that we can have hwcur = rhead. All the slots
+		 * are now owned by the user, so we can also reinit hwtail.
+		 */
+		new_hwofs = kring->rhead;
+		new_hwtail = nm_prev(kring->rhead, kring->nkr_num_slots - 1);
 	} else {
 		if (n >= na->num_rx_rings)
 			return NULL;
 		kring = na->rx_rings[n];
-
-		if (kring->nr_pending_mode == NKR_NETMAP_OFF) {
-			kring->nr_mode = NKR_NETMAP_OFF;
-			return NULL;
-		}
-
-		new_hwofs = kring->nr_hwtail - new_cur;
+		/*
+		 * Set hwofs to hwtail, so that slots[hwtail] is mapped to
+		 * the NIC internal slot 0, and thus the netmap buffer
+		 * at hwtail is the next to be given to the NIC.
+		 * Unread slots (the ones in [rhead,hwtail[) are owned by
+		 * the user, and thus the caller cannot give them
+		 * to the NIC right now.
+		 */
+		new_hwofs = kring->nr_hwtail;
+		new_hwtail = kring->nr_hwtail;
 	}
-	lim = kring->nkr_num_slots - 1;
-	if (new_hwofs > lim)
-		new_hwofs -= lim + 1;
-
-	/* Always set the new offset value and realign the ring. */
-	if (netmap_debug & NM_DEBUG_ON)
-	    nm_prinf("%s %s%d hwofs %d -> %d, hwtail %d -> %d",
-		na->name,
-		tx == NR_TX ? "TX" : "RX", n,
-		kring->nkr_hwofs, new_hwofs,
-		kring->nr_hwtail,
-		tx == NR_TX ? lim : kring->nr_hwtail);
-	kring->nkr_hwofs = new_hwofs;
-	if (tx == NR_TX) {
-		kring->nr_hwtail = kring->nr_hwcur + lim;
-		if (kring->nr_hwtail > lim)
-			kring->nr_hwtail -= lim + 1;
+	if (kring->nr_pending_mode == NKR_NETMAP_OFF) {
+		kring->nr_mode = NKR_NETMAP_OFF;
+		return NULL;
 	}
+	if (netmap_verbose) {
+	    nm_prinf("%s, hc %u->%u, ht %u->%u, ho %u->%u", kring->name,
+	        kring->nr_hwcur, kring->rhead,
+	        kring->nr_hwtail, new_hwtail,
+		kring->nkr_hwofs, new_hwofs);
+	}
+	kring->nr_hwcur = kring->rhead;
+	kring->nr_hwtail = new_hwtail;
+	kring->nkr_hwofs = new_hwofs;
 
 	/*
 	 * Wakeup on the individual and global selwait
@@ -4226,7 +4222,7 @@ nm_set_native_flags(struct netmap_adapter *na)
 	struct ifnet *ifp = na->ifp;
 
 	/* We do the setup for intercepting packets only if we are the
-	 * first user of this adapapter. */
+	 * first user of this adapter. */
 	if (na->active_fds > 0) {
 		return;
 	}
diff --git a/sys/net/iflib.c b/sys/net/iflib.c
index 3e115e992e4a..1653fd31a7f7 100644
--- a/sys/net/iflib.c
+++ b/sys/net/iflib.c
@@ -807,11 +807,6 @@ iflib_netmap_register(struct netmap_adapter *na, int onoff)
 	int status;
 
 	CTX_LOCK(ctx);
-	IFDI_INTR_DISABLE(ctx);
-
-	/* Tell the stack that the interface is no longer active */
-	ifp->if_drv_flags &= ~(IFF_DRV_RUNNING | IFF_DRV_OACTIVE);
-
 	if (!CTX_IS_VF(ctx))
 		IFDI_CRCSTRIP_SET(ctx, onoff, iflib_crcstrip);
 
@@ -820,7 +815,10 @@ iflib_netmap_register(struct netmap_adapter *na, int onoff)
 	/*
 	 * Enable (or disable) netmap flags, and intercept (or restore)
 	 * ifp->if_transmit. This is done once the device has been stopped
-	 * to prevent race conditions.
+	 * to prevent race conditions. Also, this must be done after
+	 * calling netmap_disable_all_rings() and before calling
+	 * netmap_enable_all_rings(), so that these two functions see the
+	 * updated state of the NAF_NETMAP_ON bit.
 	 */
 	if (onoff) {
 		nm_set_native_flags(na);
@@ -842,13 +840,13 @@ netmap_fl_refill(iflib_rxq_t rxq, struct netmap_kring *kring, bool init)
 {
 	struct netmap_adapter *na = kring->na;
 	u_int const lim = kring->nkr_num_slots - 1;
-	u_int nm_i = kring->nr_hwcur;
 	struct netmap_ring *ring = kring->ring;
 	bus_dmamap_t *map;
 	struct if_rxd_update iru;
 	if_ctx_t ctx = rxq->ifr_ctx;
 	iflib_fl_t fl = &rxq->ifr_fl[0];
 	u_int nic_i_first, nic_i;
+	u_int nm_i;
 	int i, n;
 #if IFLIB_DEBUG_COUNTERS
 	int rf_count = 0;
@@ -857,30 +855,42 @@ netmap_fl_refill(iflib_rxq_t rxq, struct netmap_kring *kring, bool init)
 	/*
 	 * This function is used both at initialization and in rxsync.
 	 * At initialization we need to prepare (with isc_rxd_refill())
-	 * all the (N) netmap buffers in the ring, in such a way to keep
-	 * fl->ifl_pidx and kring->nr_hwcur in sync (except for
-	 * kring->nkr_hwofs); at rxsync time, both indexes point to the
-	 * next buffer to be refilled.
+	 * all the netmap buffers currently owned by the kernel, in
+	 * such a way to keep fl->ifl_pidx and kring->nr_hwcur in sync
+	 * (except for kring->nkr_hwofs). These may be less than
+	 * kring->nkr_num_slots if netmap_reset() was called while
+	 * an application using the kring that still owned some
+	 * buffers.
+	 * At rxsync time, both indexes point to the next buffer to be
+	 * refilled.
 	 * In any case we publish (with isc_rxd_flush()) up to
 	 * (fl->ifl_pidx - 1) % N (included), to avoid the NIC tail/prod
 	 * pointer to overrun the head/cons pointer, although this is
 	 * not necessary for some NICs (e.g. vmx).
 	 */
-	if (__predict_false(init))
-		n = kring->nkr_num_slots;
-	else {
-		n = kring->rhead - nm_i;
+	if (__predict_false(init)) {
+		n = kring->nkr_num_slots - nm_kr_rxspace(kring);
+	} else {
+		n = kring->rhead - kring->nr_hwcur;
 		if (n == 0)
 			return (0); /* Nothing to do. */
 		if (n < 0)
 			n += kring->nkr_num_slots;
 	}
 
-	/* Start to refill from nr_hwcur, publishing n buffers. */
 	iru_init(&iru, rxq, 0 /* flid */);
 	map = fl->ifl_sds.ifsd_map;
 	nic_i = fl->ifl_pidx;
-	MPASS(nic_i == netmap_idx_k2n(kring, nm_i));
+	nm_i = netmap_idx_n2k(kring, nic_i);
+	if (__predict_false(init)) {
+		/*
+		 * On init/reset, nic_i must be 0, and we must
+		 * start to refill from hwtail (see netmap_reset()).
+		 */
+		MPASS(nic_i == 0);
+		MPASS(nm_i == kring->nr_hwtail);
+	} else
+		MPASS(nm_i == kring->nr_hwcur);
 	DBG_COUNTER_INC(fl_refills);
 	while (n > 0) {
 #if IFLIB_DEBUG_COUNTERS
@@ -920,7 +930,10 @@ netmap_fl_refill(iflib_rxq_t rxq, struct netmap_kring *kring, bool init)
 		ctx->isc_rxd_refill(ctx->ifc_softc, &iru);
 	}
 	fl->ifl_pidx = nic_i;
-	MPASS(!init || nm_i == 0);
+	/*
+	 * At the end of the loop we must have refilled everything
+	 * we could possibly refill.
+	 */
 	MPASS(nm_i == kring->rhead);
 	kring->nr_hwcur = nm_i;
 
@@ -1302,6 +1315,8 @@ iflib_netmap_timer(void *arg)
 #define iflib_netmap_txq_init(ctx, txq) (0)
 #define iflib_netmap_rxq_init(ctx, rxq) (0)
 #define iflib_netmap_detach(ifp)
+#define netmap_enable_all_rings(ifp)
+#define netmap_disable_all_rings(ifp)
 
 #define iflib_netmap_attach(ctx) (0)
 #define netmap_rx_irq(ifp, qid, budget) (0)
@@ -2452,6 +2467,12 @@ iflib_init_locked(if_ctx_t ctx)
 	if_setdrvflagbits(ifp, IFF_DRV_OACTIVE, IFF_DRV_RUNNING);
 	IFDI_INTR_DISABLE(ctx);
 
+	/*
+	 * See iflib_stop(). Useful in case iflib_init_locked() is
+	 * called without first calling iflib_stop().
+	 */
+	netmap_disable_all_rings(ifp);
+
 	tx_ip_csum_flags = scctx->isc_tx_csum_flags & (CSUM_IP | CSUM_TCP | CSUM_UDP | CSUM_SCTP);
 	tx_ip6_csum_flags = scctx->isc_tx_csum_flags & (CSUM_IP6_TCP | CSUM_IP6_UDP | CSUM_IP6_SCTP);
 	/* Set hardware offload abilities */
@@ -2508,6 +2529,9 @@ done:
 	for (i = 0; i < sctx->isc_ntxqsets; i++, txq++)
 		callout_reset_on(&txq->ift_timer, hz/2, iflib_timer, txq,
 			txq->ift_timer.c_cpu);
+
+        /* Re-enable txsync/rxsync. */
+	netmap_enable_all_rings(ifp);
 }
 
 static int
@@ -2553,6 +2577,13 @@ iflib_stop(if_ctx_t ctx)
 	IFDI_STOP(ctx);
 	DELAY(1000);
 
+	/*
+	 * Stop any pending txsync/rxsync and prevent new ones
+	 * form starting. Processes blocked in poll() will get
+	 * POLLERR.
+	 */
+	netmap_disable_all_rings(ctx->ifc_ifp);
+
 	iflib_debug_reset();
 	/* Wait for current tx queue users to exit to disarm watchdog timer. */
 	for (i = 0; i < scctx->isc_ntxqsets; i++, txq++) {


More information about the dev-commits-src-all mailing list