[REVIEW/TEST] polling(4) changes

Gleb Smirnoff glebius at FreeBSD.org
Fri Sep 30 05:40:04 PDT 2005


  [please, follow-up on net@ only]

  Colleagues,

  here are some patches for review.

  Problems addressed:

  1) When Giant was removed from polling a problem was introduced. The idle
poll feature was broken. The idle poll thread can enter polling handler on
one interface and put to sleep for a long time, until CPU resources found.
During this time no traffic is received on interface. Well, this is what
idle thread is supposed to do. Why didn't this happen with Giant? Because
idle poll entered poll handler holding Giant, and other threads (in particular
netisr poll) contested on Giant and propagated their priority to idle poll.
Well, this is a hack, but idle poll significantly improves polling performance
on an idle box, that's why I won't axe it but try to fix it.

  To address the problem we need to use the same technique as before, but use
poll_mtx instead of Giant. However, this will resurrect LORs, that were fixed
with Giant removal. The alternative lock path happens, when driver decides
to deregister from polling itself. The LOR is fixed by further changes. See 3).

  2) Drivers indicate their ability to do polling(4) with IFCAP_POLLING flag
int if_capabilites field. Setting the flag in if_capenable should register
interface with polling and disable interrupts. However, the if_flags is also
abused with IFF_POLLING flag. The aim is to remove IFF_POLLING flag.

  3) The polling is switched on and off not functionally. That is, when you
say 'sysctl kern.polling.enable=1' or 'ifconfig fxp0 -polling', the polling
is switched on/off not immediately but on next tick or next interrupt. This
non-functional approach leads to a lot of ambiguouties in code, makes it
harder to understand and maintain. It also exposes race conditions.

The attached patch removes:
  - IFF_POLLING flag.
  - Use of if_flags, if_drv_flags, if_capenable from kern_poll.c.
    All current accesses to these fields are not locked and polling shouldn't
    look there.
  - poll in trap feature. Sorry, we can't acquire mutexes in trap(). Anyone
    used it, anyway?
  - POLL_DEREGISTER command. No hacks. Everything is done functionally via
    ioctl().

The new world order for driver is the following:

  1) Declare IFCAP_POLLING in if_capabilities on attach. Do not touch if_capenable.
  2) in ioctl method, in SIOCSIFCAP case the driver should:
	- call ether_poll_[de]register
	- if no error, set the IFCAP_POLLING flag in if_capenable
	- obtain driver lock
	- [dis/en]able interrupts
	- drop driver lock
  3) In poll method, check IFF_DRV_RUNNING flag after obtaining driver lock
  4) In interrupt handler check IFCAP_POLLING flag in if_capenable. If present,
     then return. This is important to protect from spurious interrupts.
  5) In device detach method, call ether_poll_deregister() before obtaining
     driver lock.

The new world order for user is the following:

  - polling should be enabled and disabled with ifconfig(8)
  - kern.polling.enable is now deprecated. It is kept for some compatibility. When
    you set it to 1, polling is turned on for all capable interfaces. In case of 0
    polling is turned off.
  - no poll in trap

The attached patch touches only em(4) and fxp(4). I will write patches for all
other drivers ASAP. But I don't have all the hardware, so if you are using polling(4)
and you run FreeBSD 6 or 7, please help me with testing. ATM only em(4) driver patch
is tested.

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE
-------------- next part --------------
Index: amd64/amd64/trap.c
===================================================================
RCS file: /home/ncvs/src/sys/amd64/amd64/trap.c,v
retrieving revision 1.293
diff -u -r1.293 trap.c
--- amd64/amd64/trap.c	28 Sep 2005 07:03:02 -0000	1.293
+++ amd64/amd64/trap.c	29 Sep 2005 09:41:32 -0000
@@ -146,11 +146,6 @@
 extern char *syscallnames[];
 #endif
 
-#ifdef DEVICE_POLLING
-extern u_int32_t poll_in_trap;
-extern int ether_poll(int count);
-#endif /* DEVICE_POLLING */
-
 /*
  * Exception, fault, and trap interface to the FreeBSD kernel.
  * This common code is called from assembly language IDT gate entry
@@ -241,11 +236,6 @@
 			trap_fatal(&frame, frame.tf_addr);
 	}
 
-#ifdef	DEVICE_POLLING
-	if (poll_in_trap)
-		ether_poll(poll_in_trap);
-#endif	/* DEVICE_POLLING */
-
         if (ISPL(frame.tf_cs) == SEL_UPL) {
 		/* user trap */
 
Index: dev/em/if_em.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/em/if_em.c,v
retrieving revision 1.72
diff -u -r1.72 if_em.c
--- dev/em/if_em.c	20 Sep 2005 14:52:57 -0000	1.72
+++ dev/em/if_em.c	29 Sep 2005 13:26:42 -0000
@@ -197,6 +197,9 @@
 static void em_add_int_delay_sysctl(struct adapter *, const char *,
 				    const char *, struct em_int_delay_info *,
 				    int, int);
+#ifdef DEVICE_POLLING
+static poll_handler_t em_poll;
+#endif
 
 /*********************************************************************
  *  FreeBSD Device Interface Entry Points                    
@@ -526,6 +529,11 @@
 
 	INIT_DEBUGOUT("em_detach: begin");
 
+#ifdef DEVICE_POLLING
+	if (ifc->if_capenable & IFCAP_POLLING)
+		ether_poll_deregister(ifp);
+#endif
+
 	EM_LOCK(adapter);
 	adapter->in_detach = 1;
 	em_stop(adapter);
@@ -717,7 +725,7 @@
 				em_initialize_receive_unit(adapter);
 			}
 #ifdef DEVICE_POLLING
-                        if (!(ifp->if_flags & IFF_POLLING))
+                        if (!(ifp->if_capenable & IFCAP_POLLING))
 #endif
 				em_enable_intr(adapter);
 			EM_UNLOCK(adapter);
@@ -732,8 +740,26 @@
 		IOCTL_DEBUGOUT("ioctl rcv'd: SIOCSIFCAP (Set Capabilities)");
 		reinit = 0;
 		mask = ifr->ifr_reqcap ^ ifp->if_capenable;
-		if (mask & IFCAP_POLLING)
-			ifp->if_capenable ^= IFCAP_POLLING;
+#ifdef DEVICE_POLLING
+		if (mask & IFCAP_POLLING) {
+			if (ifr->ifr_reqcap & IFCAP_POLLING) {
+				error = ether_poll_register(em_poll, ifp);
+				if (error)
+					return(error);
+				EM_LOCK(adapter);
+				em_disable_intr(adapter);
+				ifp->if_capenable |= IFCAP_POLLING;
+				EM_UNLOCK(adapter);
+			} else {
+				error = ether_poll_deregister(ifp);
+				/* Enable interrupt even in error case */
+				EM_LOCK(adapter);
+				em_enable_intr(adapter);
+				ifp->if_capenable &= ~IFCAP_POLLING;
+				EM_UNLOCK(adapter);
+			}
+		}
+#endif
 		if (mask & IFCAP_HWCSUM) {
 			ifp->if_capenable ^= IFCAP_HWCSUM;
 			reinit = 1;
@@ -895,7 +921,7 @@
          * Only enable interrupts if we are not polling, make sure
          * they are off otherwise.
          */
-        if (ifp->if_flags & IFF_POLLING)
+        if (ifp->if_capenable & IFCAP_POLLING)
                 em_disable_intr(adapter);
         else
 #endif /* DEVICE_POLLING */
@@ -920,8 +946,6 @@
 
 
 #ifdef DEVICE_POLLING
-static poll_handler_t em_poll;
-        
 static void     
 em_poll_locked(struct ifnet *ifp, enum poll_cmd cmd, int count)
 {
@@ -930,14 +954,6 @@
 
 	mtx_assert(&adapter->mtx, MA_OWNED);
 
-	if (!(ifp->if_capenable & IFCAP_POLLING)) {
-		ether_poll_deregister(ifp);
-		cmd = POLL_DEREGISTER;
-	}
-        if (cmd == POLL_DEREGISTER) {       /* final call, enable interrupts */
-                em_enable_intr(adapter);
-                return;
-        }
         if (cmd == POLL_AND_CHECK_STATUS) {
                 reg_icr = E1000_READ_REG(&adapter->hw, ICR);
                 if (reg_icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC)) {
@@ -948,13 +964,10 @@
 			callout_reset(&adapter->timer, hz, em_local_timer, adapter);
                 }
         }
-        if (ifp->if_drv_flags & IFF_DRV_RUNNING) {
-                em_process_receive_interrupts(adapter, count);
-                em_clean_transmit_interrupts(adapter);
-        }
+	em_process_receive_interrupts(adapter, count);
+	em_clean_transmit_interrupts(adapter);
 	
-        if (ifp->if_drv_flags & IFF_DRV_RUNNING &&
-	    !IFQ_DRV_IS_EMPTY(&ifp->if_snd))
+        if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd))
                 em_start_locked(ifp);
 }
         
@@ -964,7 +977,8 @@
         struct adapter *adapter = ifp->if_softc;
 
 	EM_LOCK(adapter);
-	em_poll_locked(ifp, cmd, count);
+	if (ifp->if_drv_flags & IFF_DRV_RUNNING)
+		em_poll_locked(ifp, cmd, count);
 	EM_UNLOCK(adapter);
 }
 #endif /* DEVICE_POLLING */
@@ -987,18 +1001,10 @@
         ifp = adapter->ifp;  
 
 #ifdef DEVICE_POLLING
-        if (ifp->if_flags & IFF_POLLING) {
+        if (ifp->if_capenable & IFCAP_POLLING) {
 		EM_UNLOCK(adapter);
                 return;
 	}
-
-	if ((ifp->if_capenable & IFCAP_POLLING) &&
-	    ether_poll_register(em_poll, ifp)) {
-                em_disable_intr(adapter);
-                em_poll_locked(ifp, 0, 1);
-		EM_UNLOCK(adapter);
-                return;
-        }
 #endif /* DEVICE_POLLING */
 
 	reg_icr = E1000_READ_REG(&adapter->hw, ICR);
@@ -1718,9 +1724,7 @@
 	mtx_assert(&adapter->mtx, MA_OWNED);
 
 	INIT_DEBUGOUT("em_stop: begin");
-#ifdef DEVICE_POLLING
-	ether_poll_deregister(ifp);
-#endif
+
 	em_disable_intr(adapter);
 	em_reset_hw(&adapter->hw);
 	callout_stop(&adapter->timer);
@@ -1976,7 +1980,6 @@
 
 #ifdef DEVICE_POLLING
 	ifp->if_capabilities |= IFCAP_POLLING;
-	ifp->if_capenable |= IFCAP_POLLING;
 #endif
 
 	/* 
@@ -2911,12 +2914,13 @@
 						       adapter->fmp = NULL);
  
                                 if (adapter->fmp != NULL) {
+					struct mbuf *m = adapter->fmp;
+                                	adapter->fmp = NULL;
 					EM_UNLOCK(adapter);
-                                        (*ifp->if_input)(ifp, adapter->fmp);
+                                        (*ifp->if_input)(ifp, m);
 					EM_LOCK(adapter);
 				}
 #endif
-                                adapter->fmp = NULL;
                                 adapter->lmp = NULL;
                         }
 		} else {
Index: dev/fxp/if_fxp.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/fxp/if_fxp.c,v
retrieving revision 1.247
diff -u -r1.247 if_fxp.c
--- dev/fxp/if_fxp.c	27 Sep 2005 09:01:10 -0000	1.247
+++ dev/fxp/if_fxp.c	29 Sep 2005 13:28:26 -0000
@@ -773,7 +773,6 @@
 #ifdef DEVICE_POLLING
 	/* Inform the world we support polling. */
 	ifp->if_capabilities |= IFCAP_POLLING;
-	ifp->if_capenable |= IFCAP_POLLING;
 #endif
 
 	/*
@@ -891,6 +890,11 @@
 {
 	struct fxp_softc *sc = device_get_softc(dev);
 
+#ifdef DEVICE_POLLING
+	if (ifc->if_capenable & IFCAP_POLLING)   
+		ether_poll_deregister(sc->ifp);
+#endif
+
 	FXP_LOCK(sc);
 	sc->suspended = 1;	/* Do same thing as we do for suspend */
 	/*
@@ -1448,15 +1452,11 @@
 	uint8_t statack;
 
 	FXP_LOCK(sc);
-	if (!(ifp->if_capenable & IFCAP_POLLING)) {
-		ether_poll_deregister(ifp);
-		cmd = POLL_DEREGISTER;
-	}
-	if (cmd == POLL_DEREGISTER) {	/* final call, enable interrupts */
-		CSR_WRITE_1(sc, FXP_CSR_SCB_INTRCNTL, 0);
+	if (!(ifp->if_drv_flags & IFF_DRV_RUNNING)) {
 		FXP_UNLOCK(sc);
 		return;
 	}
+
 	statack = FXP_SCB_STATACK_CXTNO | FXP_SCB_STATACK_CNA |
 	    FXP_SCB_STATACK_FR;
 	if (cmd == POLL_AND_CHECK_STATUS) {
@@ -1495,18 +1495,10 @@
 	}
 
 #ifdef DEVICE_POLLING
-	if (ifp->if_flags & IFF_POLLING) {
+	if (ifp->if_capenable & IFCAP_POLLING) {
 		FXP_UNLOCK(sc);
 		return;
 	}
-	if ((ifp->if_capenable & IFCAP_POLLING) &&
-	    ether_poll_register(fxp_poll, ifp)) {
-		/* disable interrupts */
-		CSR_WRITE_1(sc, FXP_CSR_SCB_INTRCNTL, FXP_SCB_INTR_DISABLE);
-		FXP_UNLOCK(sc);
-		fxp_poll(ifp, 0, 1);
-		return;
-	}
 #endif
 	while ((statack = CSR_READ_1(sc, FXP_CSR_SCB_STATACK)) != 0) {
 		/*
@@ -1837,9 +1829,6 @@
 	ifp->if_drv_flags &= ~(IFF_DRV_RUNNING | IFF_DRV_OACTIVE);
 	ifp->if_timer = 0;
 
-#ifdef DEVICE_POLLING
-	ether_poll_deregister(ifp);
-#endif
 	/*
 	 * Cancel stats updater.
 	 */
@@ -2163,7 +2152,7 @@
 	 * ... but only do that if we are not polling. And because (presumably)
 	 * the default is interrupts on, we need to disable them explicitly!
 	 */
-	if ( ifp->if_flags & IFF_POLLING )
+	if (ifp->if_capenable & IFCAP_POLLING )
 		CSR_WRITE_1(sc, FXP_CSR_SCB_INTRCNTL, FXP_SCB_INTR_DISABLE);
 	else
 #endif /* DEVICE_POLLING */
@@ -2418,11 +2407,30 @@
 		break;
 
 	case SIOCSIFCAP:
-		FXP_LOCK(sc);
 		mask = ifp->if_capenable ^ ifr->ifr_reqcap;
-		if (mask & IFCAP_POLLING)
-			ifp->if_capenable ^= IFCAP_POLLING;
+#ifdef DEVICE_POLLING
+		if (mask & IFCAP_POLLING) {
+			if (ifr->ifr_reqcap & IFCAP_POLLING) {
+				error = ether_poll_register(fxp_poll, ifp);
+				if (error)
+					return(error);
+				FXP_LOCK(sc);
+				CSR_WRITE_1(sc, FXP_CSR_SCB_INTRCNTL,
+				    FXP_SCB_INTR_DISABLE);
+				ifp->if_capenable |= IFCAP_POLLING;
+				FXP_UNLOCK(sc);
+			} else {
+				error = ether_poll_deregister(ifp);
+				/* Enable interrupts in any case */
+				FXP_LOCK(sc);
+				CSR_WRITE_1(sc, FXP_CSR_SCB_INTRCNTL, 0);
+				ifp->if_capenable &= ~IFCAP_POLLING;
+				FXP_UNLOCK(sc);
+			}
+		}
+#endif
 		if (mask & IFCAP_VLAN_MTU) {
+			FXP_LOCK(sc);
 			ifp->if_capenable ^= IFCAP_VLAN_MTU;
 			if (sc->revision != FXP_REV_82557)
 				flag = FXP_FLAG_LONG_PKT_EN;
@@ -2431,8 +2439,8 @@
 			sc->flags ^= flag;
 			if (ifp->if_flags & IFF_UP)
 				fxp_init_body(sc);
+			FXP_UNLOCK(sc);
 		}
-		FXP_UNLOCK(sc);
 		break;
 
 	default:
Index: i386/i386/trap.c
===================================================================
RCS file: /home/ncvs/src/sys/i386/i386/trap.c,v
retrieving revision 1.280
diff -u -r1.280 trap.c
--- i386/i386/trap.c	28 Sep 2005 07:03:03 -0000	1.280
+++ i386/i386/trap.c	29 Sep 2005 09:42:21 -0000
@@ -160,11 +160,6 @@
 extern char *syscallnames[];
 #endif
 
-#ifdef DEVICE_POLLING
-extern u_int32_t poll_in_trap;
-extern int ether_poll(int count);
-#endif /* DEVICE_POLLING */
-
 /*
  * Exception, fault, and trap interface to the FreeBSD kernel.
  * This common code is called from assembly language IDT gate entry
@@ -272,11 +267,6 @@
 			trap_fatal(&frame, eva);
 	}
 
-#ifdef	DEVICE_POLLING
-	if (poll_in_trap)
-		ether_poll(poll_in_trap);
-#endif	/* DEVICE_POLLING */
-
         if ((ISPL(frame.tf_cs) == SEL_UPL) ||
 	    ((frame.tf_eflags & PSL_VM) && 
 		!(PCPU_GET(curpcb)->pcb_flags & PCB_VM86CALL))) {
Index: kern/kern_poll.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_poll.c,v
retrieving revision 1.22
diff -u -r1.22 kern_poll.c
--- kern/kern_poll.c	6 Sep 2005 11:09:18 -0000	1.22
+++ kern/kern_poll.c	29 Sep 2005 12:26:58 -0000
@@ -32,6 +32,7 @@
 #include <sys/systm.h>
 #include <sys/kernel.h>
 #include <sys/socket.h>			/* needed by net/if.h		*/
+#include <sys/sockio.h>
 #include <sys/sysctl.h>
 #include <sys/syslog.h>
 
@@ -44,14 +45,15 @@
 
 static void netisr_poll(void);		/* the two netisr handlers      */
 static void netisr_pollmore(void);
+static int poll_switch(SYSCTL_HANDLER_ARGS);
 
 void hardclock_device_poll(void);	/* hook from hardclock		*/
-void ether_poll(int);			/* polling while in trap	*/
+void ether_poll(int);			/* polling in idle loop		*/
 
 /*
  * Polling support for [network] device drivers.
  *
- * Drivers which support this feature try to register with the
+ * Drivers which support this feature can register with the
  * polling code.
  *
  * If registration is successful, the driver must disable interrupts,
@@ -64,10 +66,6 @@
  *  POLL_AND_CHECK_STATUS: as above, plus check status registers or do
  *	other more expensive operations. This command is issued periodically
  *	but less frequently than POLL_ONLY.
- *  POLL_DEREGISTER: deregister and return to interrupt mode.
- *
- * The first two commands are only issued if the interface is marked as
- * 'IFF_UP and IFF_DRV_RUNNING', the last one only if IFF_DRV_RUNNING is set.
  *
  * The count limit specifies how much work the handler can do during the
  * call -- typically this is the number of packets to be received, or
@@ -75,11 +73,9 @@
  * as the max time spent in the function grows roughly linearly with the
  * count).
  *
- * Deregistration can be requested by the driver itself (typically in the
- * *_stop() routine), or by the polling code, by invoking the handler.
- *
- * Polling can be globally enabled or disabled with the sysctl variable
- * kern.polling.enable (default is 0, disabled)
+ * Polling is enabled and disabled via setting IFCAP_POLLING flag on
+ * the interface. The driver ioctl handler should register interface
+ * with polling and disable interrupts, if registration was successfull.
  *
  * A second variable controls the sharing of CPU between polling/kernel
  * network processing, and other activities (typically userlevel tasks):
@@ -91,7 +87,7 @@
  * The following constraints hold
  *
  *	1 <= poll_each_burst <= poll_burst <= poll_burst_max
- *	0 <= poll_in_trap <= poll_each_burst
+ *	0 <= poll_each_burst
  *	MIN_POLL_BURST_MAX <= poll_burst_max <= MAX_POLL_BURST_MAX
  */
 
@@ -117,10 +113,6 @@
 SYSCTL_UINT(_kern_polling, OID_AUTO, idle_poll, CTLFLAG_RW,
 	&poll_in_idle_loop, 0, "Enable device polling in idle loop");
 
-u_int32_t poll_in_trap;			/* used in trap.c */
-SYSCTL_UINT(_kern_polling, OID_AUTO, poll_in_trap, CTLFLAG_RW,
-	&poll_in_trap, 0, "Poll burst size during a trap");
-
 static u_int32_t user_frac = 50;
 SYSCTL_UINT(_kern_polling, OID_AUTO, user_frac, CTLFLAG_RW,
 	&user_frac, 0, "Desired user fraction of cpu time");
@@ -149,9 +141,9 @@
 SYSCTL_UINT(_kern_polling, OID_AUTO, handlers, CTLFLAG_RD,
 	&poll_handlers, 0, "Number of registered poll handlers");
 
-static int polling = 0;		/* global polling enable */
-SYSCTL_UINT(_kern_polling, OID_AUTO, enable, CTLFLAG_RW,
-	&polling, 0, "Polling enabled");
+static int polling = 0;
+SYSCTL_PROC(_kern_polling, OID_AUTO, enable, CTLTYPE_UINT | CTLFLAG_RW,
+	0, sizeof(int), poll_switch, "I", "Switch polling for all interfaces");
 
 static u_int32_t phase;
 SYSCTL_UINT(_kern_polling, OID_AUTO, phase, CTLFLAG_RW,
@@ -174,23 +166,9 @@
 struct pollrec {
 	poll_handler_t	*handler;
 	struct ifnet	*ifp;
-	/*
-	 * Flags of polling record (protected by poll_mtx).
-	 * PRF_RUNNING means that the handler is now executing.
-	 * PRF_LEAVING means that the handler is now deregistering.
-	 */
-#define	PRF_RUNNING	0x1
-#define	PRF_LEAVING	0x2
-	uint32_t	flags;
 };
 
 static struct pollrec pr[POLL_LIST_LEN];
-
-#define	PR_VALID(i)	(pr[(i)].handler != NULL &&			\
-			!(pr[(i)].flags & (PRF_RUNNING|PRF_LEAVING)) &&	\
-			(pr[(i)].ifp->if_drv_flags & IFF_DRV_RUNNING) &&\
-			(pr[(i)].ifp->if_flags & IFF_UP))
-
 static struct mtx	poll_mtx;
 
 static void
@@ -258,30 +236,24 @@
 }
 
 /*
- * ether_poll is called from the idle loop or from the trap handler.
+ * ether_poll is called from the idle loop.
  */
 void
 ether_poll(int count)
 {
 	int i;
 
+	NET_LOCK_GIANT();
 	mtx_lock(&poll_mtx);
 
 	if (count > poll_each_burst)
 		count = poll_each_burst;
 
-	for (i = 0 ; i < poll_handlers ; i++) {
-		if (PR_VALID(i)) {
-			pr[i].flags |= PRF_RUNNING;
-			mtx_unlock(&poll_mtx);
-			NET_LOCK_GIANT();
-			pr[i].handler(pr[i].ifp, POLL_ONLY, count);
-			NET_UNLOCK_GIANT();
-			mtx_lock(&poll_mtx);
-			pr[i].flags &= ~PRF_RUNNING;
-		}
-	}
+	for (i = 0 ; i < poll_handlers ; i++)
+		pr[i].handler(pr[i].ifp, POLL_ONLY, count);
+
 	mtx_unlock(&poll_mtx);
+	NET_UNLOCK_GIANT();
 }
 
 /*
@@ -403,60 +375,29 @@
 		residual_burst : poll_each_burst;
 	residual_burst -= cycles;
 
-	if (polling) {
-		for (i = 0 ; i < poll_handlers ; i++) {
-			if (PR_VALID(i)) {
-				pr[i].flags |= PRF_RUNNING;
-				mtx_unlock(&poll_mtx);
-				pr[i].handler(pr[i].ifp, arg, cycles);
-				mtx_lock(&poll_mtx);
-				pr[i].flags &= ~PRF_RUNNING;
-			}
-		}
-	} else {	/* unregister */
-		for (i = 0 ; i < poll_handlers ; i++) {
-			if (pr[i].handler != NULL &&
-			    pr[i].ifp->if_drv_flags & IFF_DRV_RUNNING) {
-				pr[i].ifp->if_flags &= ~IFF_POLLING;
-				pr[i].flags |= PRF_LEAVING;
-				mtx_unlock(&poll_mtx);
-				pr[i].handler(pr[i].ifp, POLL_DEREGISTER, 1);
-				mtx_lock(&poll_mtx);
-				pr[i].flags &= ~PRF_LEAVING;
-			}
-			pr[i].handler = NULL;
-		}
-		residual_burst = 0;
-		poll_handlers = 0;
-	}
+	for (i = 0 ; i < poll_handlers ; i++)
+		pr[i].handler(pr[i].ifp, arg, cycles);
 
 	phase = 4;
 	mtx_unlock(&poll_mtx);
 }
 
 /*
- * Try to register routine for polling. Returns 1 if successful
- * (and polling should be enabled), 0 otherwise.
+ * Try to register routine for polling. Returns 0 if successful
+ * (and polling should be enabled), error code otherwise.
  * A device is not supposed to register itself multiple times.
  *
- * This is called from within the *_intr() functions, so we do not need
- * further ifnet locking.
+ * This is called from within the *_ioctl() functions.
  */
 int
 ether_poll_register(poll_handler_t *h, struct ifnet *ifp)
 {
 	int i;
 
-	NET_ASSERT_GIANT();
+	KASSERT(h != NULL, ("%s: handler is NULL", __func__));
+	KASSERT(ifp != NULL, ("%s: ifp is NULL", __func__));
 
-	if (polling == 0) /* polling disabled, cannot register */
-		return 0;
-	if (h == NULL || ifp == NULL)		/* bad arguments	*/
-		return 0;
-	if ( !(ifp->if_flags & IFF_UP) )	/* must be up		*/
-		return 0;
-	if (ifp->if_flags & IFF_POLLING)	/* already polling	*/
-		return 0;
+	NET_ASSERT_GIANT();
 
 	mtx_lock(&poll_mtx);
 	if (poll_handlers >= POLL_LIST_LEN) {
@@ -474,7 +415,7 @@
 			verbose--;
 		}
 		mtx_unlock(&poll_mtx);
-		return 0; /* no polling for you */
+		return (ENOMEM); /* no polling for you */
 	}
 
 	for (i = 0 ; i < poll_handlers ; i++)
@@ -482,45 +423,39 @@
 			mtx_unlock(&poll_mtx);
 			log(LOG_DEBUG, "ether_poll_register: %s: handler"
 			    " already registered\n", ifp->if_xname);
-			return (0);
+			return (EEXIST);
 		}
 
 	pr[poll_handlers].handler = h;
 	pr[poll_handlers].ifp = ifp;
 	poll_handlers++;
-	ifp->if_flags |= IFF_POLLING;
 	mtx_unlock(&poll_mtx);
 	if (idlepoll_sleeping)
 		wakeup(&idlepoll_sleeping);
-	return 1; /* polling enabled in next call */
+	return (0);
 }
 
 /*
- * Remove interface from the polling list. Normally called by *_stop().
- * It is not an error to call it with IFF_POLLING clear, the call is
- * sufficiently rare to be preferable to save the space for the extra
- * test in each driver in exchange of one additional function call.
+ * Remove interface from the polling list. Called from *_ioctl(), too.
  */
 int
 ether_poll_deregister(struct ifnet *ifp)
 {
 	int i;
 
-	NET_ASSERT_GIANT();
+	KASSERT(ifp != NULL, ("%s: ifp is NULL", __func__));
 
-	if ( !ifp || !(ifp->if_flags & IFF_POLLING) ) {
-		return 0;
-	}
+	NET_ASSERT_GIANT();
 	mtx_lock(&poll_mtx);
+
 	for (i = 0 ; i < poll_handlers ; i++)
 		if (pr[i].ifp == ifp) /* found it */
 			break;
-	ifp->if_flags &= ~IFF_POLLING; /* found or not... */
 	if (i == poll_handlers) {
-		mtx_unlock(&poll_mtx);
 		log(LOG_DEBUG, "ether_poll_deregister: %s: not found!\n",
 		    ifp->if_xname);
-		return (0);
+		mtx_unlock(&poll_mtx);
+		return (ENOENT);
 	}
 	poll_handlers--;
 	if (i < poll_handlers) { /* Last entry replaces this one. */
@@ -528,7 +463,60 @@
 		pr[i].ifp = pr[poll_handlers].ifp;
 	}
 	mtx_unlock(&poll_mtx);
-	return (1);
+	return (0);
+}
+
+/*
+ * Legacy interface for turning polling on all interfaces at one time.
+ */
+static int
+poll_switch(SYSCTL_HANDLER_ARGS)
+{
+	struct ifnet *ifp;
+	int error;
+	int val;
+
+	mtx_lock(&poll_mtx);
+	val = polling;
+	mtx_unlock(&poll_mtx);
+
+	error = sysctl_handle_int(oidp, &val, sizeof(int), req);
+	if (error || !req->newptr )
+		return (error);
+
+	if (val == polling)
+		return (0);
+
+	if (val < 0 || val > 1)
+		return (EINVAL);
+
+	mtx_lock(&poll_mtx);
+	polling = val;
+	mtx_unlock(&poll_mtx);
+
+	NET_LOCK_GIANT();
+	IFNET_RLOCK();
+	TAILQ_FOREACH(ifp, &ifnet, if_link) {
+		if (ifp->if_capabilities & IFCAP_POLLING) {
+			struct ifreq ifr;
+
+			if (val == 1)
+				ifr.ifr_reqcap =
+				    ifp->if_capenable | IFCAP_POLLING;
+			else
+				ifr.ifr_reqcap =
+				    ifp->if_capenable & ~IFCAP_POLLING;
+			IFF_LOCKGIANT(ifp);	/* LOR here */
+			(void) (*ifp->if_ioctl)(ifp, SIOCSIFCAP, (caddr_t)&ifr);
+			IFF_UNLOCKGIANT(ifp);
+		}
+	}
+	IFNET_RUNLOCK();
+	NET_UNLOCK_GIANT();
+
+	log(LOG_ERR, "kern.polling.enable is deprecated. Use ifconfig(8)");
+
+	return (0);
 }
 
 static void
Index: net/if.h
===================================================================
RCS file: /home/ncvs/src/sys/net/if.h,v
retrieving revision 1.98
diff -u -r1.98 if.h
--- net/if.h	9 Aug 2005 12:56:20 -0000	1.98
+++ net/if.h	29 Sep 2005 11:25:05 -0000
@@ -148,7 +148,7 @@
 #define	IFF_LINK2	0x4000		/* per link layer defined bit */
 #define	IFF_ALTPHYS	IFF_LINK2	/* use alternate physical connection */
 #define	IFF_MULTICAST	0x8000		/* (i) supports multicast */
-#define	IFF_POLLING	0x10000		/* (n) Interface is in polling mode. */
+/*			0x10000		*/
 #define	IFF_PPROMISC	0x20000		/* (n) user-requested promisc mode */
 #define	IFF_MONITOR	0x40000		/* (n) user-requested monitor mode */
 #define	IFF_STATICARP	0x80000		/* (n) static ARP */
@@ -166,8 +166,7 @@
 /* flags set internally only: */
 #define	IFF_CANTCHANGE \
 	(IFF_BROADCAST|IFF_POINTOPOINT|IFF_DRV_RUNNING|IFF_DRV_OACTIVE|\
-	    IFF_SIMPLEX|IFF_MULTICAST|IFF_ALLMULTI|IFF_SMART|IFF_PROMISC|\
-	    IFF_POLLING)
+	    IFF_SIMPLEX|IFF_MULTICAST|IFF_ALLMULTI|IFF_SMART|IFF_PROMISC)
 
 /*
  * Values for if_link_state.
Index: net/if_var.h
===================================================================
RCS file: /home/ncvs/src/sys/net/if_var.h,v
retrieving revision 1.102
diff -u -r1.102 if_var.h
--- net/if_var.h	9 Aug 2005 10:16:17 -0000	1.102
+++ net/if_var.h	29 Sep 2005 10:19:43 -0000
@@ -660,7 +660,7 @@
     LLADDR((struct sockaddr_dl *) ifaddr_byindex((ifp)->if_index)->ifa_addr)
 
 #ifdef DEVICE_POLLING
-enum poll_cmd {	POLL_ONLY, POLL_AND_CHECK_STATUS, POLL_DEREGISTER };
+enum poll_cmd {	POLL_ONLY, POLL_AND_CHECK_STATUS };
 
 typedef	void poll_handler_t(struct ifnet *ifp, enum poll_cmd cmd, int count);
 int    ether_poll_register(poll_handler_t *h, struct ifnet *ifp);


More information about the freebsd-net mailing list