svn commit: r275136 - in head/sys: dev/e1000 dev/ixgbe kern sys

John Baldwin jhb at freebsd.org
Mon Dec 1 14:48:39 UTC 2014


On Wednesday, November 26, 2014 08:19:36 PM Alfred Perlstein wrote:
> Author: alfred
> Date: Wed Nov 26 20:19:36 2014
> New Revision: 275136
> URL: https://svnweb.freebsd.org/changeset/base/275136
> 
> Log:
>   Make igb and ixgbe check tunables at probe time.
> 
>   This allows one to make a kernel module to tune the
>   number of queues before the driver loads.
> 
>   This is needed so that a module at SI_SUB_CPU can set
>   tunables for these drivers to take.  Otherwise getenv
>   is called too early by the TUNABLE macros.
> 
>   Reviewed by: smh
>   Phabric: https://reviews.freebsd.org/D1149

The explicit TUNABLE_INT_FETCH strikes me as the wrong approach and hackish.  
That is, each time you want to frob a tunable like this you will have to go 
add a bunch of code to explicitly re-read the tunable, etc.  Instead, you 
could have simply changed your helper module to use kernel_sysctlbyname() to 
set hw.igb.num_queues.  That would have required only a single character 
change to make the SYSCTL read/write instead of read/only for each tunable in 
question.  To be completely future proof (i.e. to handle loading the module in 
question post-boot), your module could both do setenv() and 
kernel_sysctlbyname().  That seems to be a more extensible approach in terms 
of allowing more of these to be changed in the future without having to do a 
manual bypass of the existing tunable infrastructure for each tunable.  I 
would much prefer that you revert this part and change the relevant tunables 
to CTLFLAG_RWTUN and update your out-of-tree module to use 
kernel_sysctlbyname().

Also, if you didn't run this by the Intel folks (e.g. jfv@) that might have 
been nice as a courtesy.

Also, please use the existing resource_int_value() that uses 'hint.igb.0.foo'
instead of inventing a different wheel (device_get_int()).  This is what all 
the other drivers in the tree that provide per-instance tunables use.  You
could perhaps have device_get_int() as a wrapper around resource_int_value(),
but we should use the existing convention, not invent a conflicting one.

> Modified:
>   head/sys/dev/e1000/if_igb.c
>   head/sys/dev/ixgbe/ixgbe.c
>   head/sys/kern/subr_bus.c
>   head/sys/sys/bus.h
> 
> Modified: head/sys/dev/e1000/if_igb.c
> ============================================================================
> == --- head/sys/dev/e1000/if_igb.c	Wed Nov 26 18:03:25 2014	(r275135) 
+++
> head/sys/dev/e1000/if_igb.c	Wed Nov 26 20:19:36 2014	(r275136) @@ 
-188,6
> +188,7 @@ static char *igb_strings[] = {
>  /*********************************************************************
>   *  Function prototypes
>   *********************************************************************/
> +static int	igb_per_unit_num_queues(SYSCTL_HANDLER_ARGS);
>  static int	igb_probe(device_t);
>  static int	igb_attach(device_t);
>  static int	igb_detach(device_t);
> @@ -493,6 +494,11 @@ igb_attach(device_t dev)
>  	    OID_AUTO, "nvm", CTLTYPE_INT|CTLFLAG_RW, adapter, 0,
>  	    igb_sysctl_nvm_info, "I", "NVM Information");
> 
> +        SYSCTL_ADD_PROC(device_get_sysctl_ctx(dev),
> +                        SYSCTL_CHILDREN(device_get_sysctl_tree(dev)),
> +			OID_AUTO, "num_queues", CTLTYPE_INT | CTLFLAG_RD,
> +			adapter, 0, igb_per_unit_num_queues, "I", "Number of Queues");
> +

You don't need igb_per_unit_num_queues().
SYSCTL_ADD_INT(..., &adapter->num_queues) should have been used instead.

> @@ -2831,6 +2837,7 @@ igb_setup_msix(struct adapter *adapter)
>  {
>  	device_t	dev = adapter->dev;
>  	int		bar, want, queues, msgs, maxqueues;
> +	int		n_queues;
> 
>  	/* tuneable override */
>  	if (igb_enable_msix == 0)
> @@ -2858,8 +2865,18 @@ igb_setup_msix(struct adapter *adapter)
>  		goto msi;
>  	}
> 
> -	/* Figure out a reasonable auto config value */
> -	queues = (mp_ncpus > (msgs-1)) ? (msgs-1) : mp_ncpus;
> +	n_queues = 0;
> +	/* try more specific tunable, then global, then finally default to boot
> time tunable if set. */ +	if (device_getenv_int(dev, "num_queues",
> &n_queues) != 0) {
> +		device_printf(dev, "using specific tunable num_queues=%d", n_queues);
> +	} else if (TUNABLE_INT_FETCH("hw.igb.num_queues", &n_queues) != 0) {
> +		if (igb_num_queues != n_queues) {
> +			device_printf(dev, "using global tunable hw.igb.num_queues=%d",
> n_queues); +			igb_num_queues = n_queues;
> +		}
> +	} else {
> +		n_queues = igb_num_queues;
> +	}
> 
>  #ifdef	RSS
>  	/* If we're doing RSS, clamp at the number of RSS buckets */
> @@ -2867,10 +2884,12 @@ igb_setup_msix(struct adapter *adapter)
>  		queues = rss_getnumbuckets();
>  #endif
> 
> -
> -	/* Manual override */
> -	if (igb_num_queues != 0)
> -		queues = igb_num_queues;
> +	if (n_queues != 0) {
> +		queues = n_queues;
> +	} else {
> +		/* Figure out a reasonable auto config value */
> +		queues = (mp_ncpus > (msgs-1)) ? (msgs-1) : mp_ncpus;
> +	}

This moves the mp_ncpus auto config below the RSS bits, so now you
are ignoring the RSS value.  You probably should have left the
mp_ncpus line where it was.

One old bug here is that igb checked the tunable value twice.  It
should only check it once at the end of all the auto-tuning.

>  	/* Sanity check based on HW */
>  	switch (adapter->hw.mac.type) {
> @@ -2893,12 +2912,17 @@ igb_setup_msix(struct adapter *adapter)
>  			maxqueues = 1;
>  			break;
>  	}
> -	if (queues > maxqueues)
> +	if (queues > maxqueues) {
> +		device_printf(adapter->dev, "requested %d queues, but max for this
> adapter is %d\n", +		    queues, maxqueues);
>  		queues = maxqueues;
> -
> -	/* Manual override */
> -	if (igb_num_queues != 0)
> -		queues = igb_num_queues;
> +	} else if (queues == 0) {
> +		queues = 1;
> +	} else if (queues < 0) {
> +		device_printf(adapter->dev, "requested %d queues, but min for this
> adapter is %d\n", +		    queues, 1);
> +		queues = 1;
> +	}

So here is where I would check the tunable to allow it to override.  Assuming 
you revert the TUNABLE_INT_FETCH business then this only needs:

	/* Manual override */
	if (device_get_int(&n_queues) != 0)
		n_queues = igb_num_queues;
	if (n_queues != 0) {
		if (n_queues > maxqueues || n_queues < 0)
			/* whine */
		else
			queues = n_queues;
	}

-- 
John Baldwin


More information about the svn-src-head mailing list