svn commit: r218232 - head/sys/netinet
Julian Elischer
julian at freebsd.org
Fri Feb 4 18:56:15 UTC 2011
On 2/4/11 9:38 AM, Robert Watson wrote:
>
> On Thu, 3 Feb 2011, John Baldwin wrote:
>
>>> 1) Move per John Baldwin to mp_maxid
>>> 2) Some signed/unsigned errors found by Mac OS compiler (from
>>> Michael)
>>> 3) a couple of copyright updates on the affected files.
>>
>> Note that mp_maxid is the maximum valid ID, so you typically have to
>> do things like:
>>
>> for (i = 0; i <= mp_maxid; i++) {
>>         if (CPU_ABSENT(i))
>>                 continue;
>>         ...
>> }
>>
>> There is a CPU_FOREACH() macro that does the above (but assumes you
>> want to skip over non-existent CPUs).
>
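To make the inclusive bound concrete, here is a minimal userland sketch of the loop John describes. Everything prefixed with sketch_ or SKETCH_ is a hypothetical stand-in (the kernel's mp_maxid, CPU_ABSENT() and CPU_FOREACH() are not available outside the kernel), and the sparse "present" map is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userland stand-ins for the kernel's mp_maxid and CPU_ABSENT().
 * The sparse "present" map below is invented for illustration.
 */
#define SKETCH_MAXCPU 8

static const bool sketch_present[SKETCH_MAXCPU] = {
	true, true, false, false, true, false, true, false
};

/* Highest valid CPU ID, not a count: IDs 0..6 are candidates. */
static const unsigned int sketch_mp_maxid = 6;

#define SKETCH_CPU_ABSENT(i)	(!sketch_present[(i)])

/* Same shape as the quoted loop: inclusive bound, absent IDs skipped. */
#define SKETCH_CPU_FOREACH(i)					\
	for ((i) = 0; (i) <= sketch_mp_maxid; (i)++)		\
		if (!SKETCH_CPU_ABSENT(i))

/* Record every present CPU ID in ids[] and return how many were found. */
static size_t
sketch_collect_cpus(unsigned int *ids, size_t cap)
{
	unsigned int i;
	size_t n = 0;

	SKETCH_CPU_FOREACH(i) {
		assert(n < cap);
		ids[n++] = i;
	}
	return (n);
}
```

With this map the iterator visits IDs 0, 1, 4 and 6. Writing the bound as `i < sketch_mp_maxid` would silently drop the last CPU, which is the mistake the note about mp_maxid warns against.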
> I'm finding the network stack requires quite a bit more along these
> lines, btw. I'd love also to have:
>
> PACKAGE_FOREACH()
> CORE_FOREACH()
> HWTHREAD_FOREACH()
>
I agree, which is why I usually support adding such iterators, even
though some people scream about them (e.g. FOREACH_THREAD_IN_PROC,
and there is one for iterating through vnets too).
> CURPACKAGE()
> CURCORE()
> CURTHREAD()
Also current jail, vnet, etc. (these kind of exist already).
>
> Available when putting together thread worker pools, distributing
> work, identifying where to channel work, making dispatch decisions
> and so on. It seems likely that in some scenarios, it will be
> desirable to have worker thread topology linked to hardware topology
> -- for example, a network stack worker per core, with distribution
> of work targeting the closest worker (subject to ordering
> constraints)...
>
>> Hmmm, this is more complicated. Can sctp_queue_to_mcore() handle
>> the fact that a cpu_to_use value might not be valid? If not you
>> might want to maintain a separate "dense" virtual CPU ID table
>> numbered 0 .. mp_ncpus - 1 that maps to "present" FreeBSD CPU IDs.
>> I think Robert has done something similar to support RSS in TCP.
>> Does that make sense?
>
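A rough sketch of the "dense" table John suggests, under stated assumptions: the struct and function names are hypothetical, this is plain userland C rather than kernel code, and it is not what sctp_queue_to_mcore() or the RSS code actually does. Slot k holds the k-th present CPU ID, so any index reduced modulo the count always lands on a valid CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAXCPU 8

/* Hypothetical dense table: slot k holds the k-th present CPU ID. */
struct sketch_cpu_map {
	unsigned int ids[SKETCH_MAXCPU];
	size_t count;		/* plays the role of mp_ncpus */
};

/* Walk 0..maxid inclusive and pack the present IDs into m->ids[]. */
static void
sketch_cpu_map_build(struct sketch_cpu_map *m, const bool *present,
    unsigned int maxid)
{
	unsigned int i;

	assert(maxid < SKETCH_MAXCPU);
	m->count = 0;
	for (i = 0; i <= maxid; i++) {
		if (!present[i])
			continue;
		m->ids[m->count++] = i;
	}
}

/* Map an arbitrary value (e.g. a flow hash) to a CPU ID that exists. */
static unsigned int
sketch_cpu_pick(const struct sketch_cpu_map *m, unsigned int hash)
{
	assert(m->count > 0);
	return (m->ids[hash % m->count]);
}
```

The point of the indirection is that callers only ever deal with the dense range 0 .. count - 1, and holes in the real CPU ID space stay hidden inside the table.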
> This proves somewhat complicated. I basically have two models,
> depending on whether RSS is involved (which adds an external
> factor). Without RSS, I build a contiguous workstream number space,
> which is then mapped via a table to the CPU ID space, allowing
> mappings and hashing to be done easily -- however, these refer to
> ordered flow processing streams (i.e., "threads") rather than CPUs,
> in the strict sense. In the future with dynamic configuration, this
> becomes important because what I do is rebalance ordered processing
> streams rather than work to CPUs. With RSS there has to be a link
> between work distribution and the CPU identifiers shared by device
> drivers, hardware, etc, in which case RSS identifies viable CPUs as
> it starts (probably not quite correctly, I'll be looking for a
> review of that code shortly, cleaning it up currently).
>
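The non-RSS model Robert describes could be sketched roughly as follows. This is an illustration only, not netisr code: all names are hypothetical and the number of workstreams is fixed at four for brevity. Flows hash to a workstream, and only the workstream-to-CPU table changes when work is rebalanced.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NWS 4	/* hypothetical number of workstreams */

/* Contiguous workstream space 0..SKETCH_NWS-1, each bound to a CPU ID. */
struct sketch_workstreams {
	unsigned int ws_cpu[SKETCH_NWS];
};

/* A flow always hashes to the same workstream, which preserves ordering. */
static size_t
sketch_flow_to_ws(unsigned int flow_hash)
{
	return (flow_hash % SKETCH_NWS);
}

/* Look up which CPU currently runs a given workstream. */
static unsigned int
sketch_ws_to_cpu(const struct sketch_workstreams *w, size_t ws)
{
	assert(ws < SKETCH_NWS);
	return (w->ws_cpu[ws]);
}

/* Rebalancing moves a whole workstream; the flow hashing is untouched. */
static void
sketch_ws_rebind(struct sketch_workstreams *w, size_t ws, unsigned int cpu)
{
	assert(ws < SKETCH_NWS);
	w->ws_cpu[ws] = cpu;
}
```

After a rebind, every flow in the moved workstream still maps to that same workstream, so per-flow ordering is kept while the CPU doing the work changes.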
> This issue came up some at the BSDCan devsummit last year: as more
> and more kernel subsystems need to exploit parallelism explicitly,
> the thread programming model isn't bad, but lacks a strong tie to
> hardware topology in order to help manage work distribution. One
> idea idly bandied around was to do something along the lines of
> KSE/GCD for the kernel: provide a layered "work" model with ordering
> constraints, rather than exploit threads directly, for work-oriented
> subsystems. This is effectively what netisr does, but in a network
> stack-specific way. But with crypto code, IPSEC, storage stuff,
> etc, all looking to exploit parallelism, perhaps a more general
> model is called for.
>
> Robert
>