svn commit: r209119 - head/sys/sys

Fri Jun 18 01:57:47 UTC 2010

On 06/17/10 17:13, Kostik Belousov wrote:
> On Thu, Jun 17, 2010 at 12:38:08PM +1000, Lawrence Stewart wrote:
>> On 06/14/10 20:43, Kostik Belousov wrote:
[snip]
>>> Or, you could ditch the sum at all, indeed using ({}) and returning the
>>> result. __typeof is your friend to select proper type of accumulator.
>>
>> So, something like this?
>>
>> #define DPCPU_SUM(n, var) __extension__                                \
>> ({                                                                     \
>>          u_int _i;                                                      \
>>          __typeof((DPCPU_PTR(n))->var) sum;                             \
>>                                                                         \
>>          sum = 0;                                                       \
>>          CPU_FOREACH(_i) {                                              \
>>                  sum += (DPCPU_ID_PTR(_i, n))->var;                     \
>>          }                                                              \
>>          sum;                                                           \
>> })
>>
>> Which can be used like this:
>>
>> totalss.n_in = DPCPU_SUM(ss, n_in);
> Yes, exactly.
>
>>
>>
>> I've tested the above and it works. I also prefer the idea of having
>> DPCPU_SUM return the sum so that you can do "var = DPCPU_SUM(...)". My
>> only concern with this method is that the caller no longer has the
>> choice to make the sum variable a larger type to avoid overflow. It
>> would be nice to be able to have the DPCPU vars be uint32_t but be able
>> to sum them into a uint64_t accumulator for example. Perhaps this isn't
>> really an issue though... I'm not sure.
> You are worried about overflow in the sum of 32 or 64 variables, but if
> this is the case, then each member of the sum can overflow as well, IMO.
> Either ignore the issue, or use a uintmax_t.

True, but I figured that on large SMP systems, where more events are
likely to be processed, 32-bit per-CPU counters may each be wide enough
to avoid overflow while the aggregate number of events still exceeds
what a 32-bit variable can hold. I suspect you're right though: if
there's a likely chance the aggregate could overflow, then the DPCPU var
should simply be made 64-bit, which also removes any possibility of the
individual per-CPU counters overflowing.
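
To make the overflow concern concrete, here is a minimal userland sketch.
It is not the kernel macro: NCPU, percpu_ss, SKETCH_SUM, SKETCH_SUM_WIDE,
fill_counters and sums_differ_as_expected are hypothetical names made up
for illustration, and a plain array stands in for the DPCPU storage. It
assumes a compiler with GNU statement expressions and __typeof (gcc or
clang).

```c
#include <stdint.h>

#define NCPU 4	/* hypothetical CPU count, for illustration only */

struct stats {
	uint32_t n_in;	/* 32-bit per-CPU event counter */
};

/* Plain array standing in for the kernel's per-CPU (DPCPU) storage. */
static struct stats percpu_ss[NCPU];

/* Same shape as the proposed macro: the accumulator takes the member's type. */
#define SKETCH_SUM(arr, var) __extension__				\
({									\
	unsigned int _i;						\
	__typeof((arr)[0].var) _sum = 0;				\
	for (_i = 0; _i < NCPU; _i++)					\
		_sum += (arr)[_i].var;					\
	_sum;								\
})

/* Variant that always accumulates into a uintmax_t. */
#define SKETCH_SUM_WIDE(arr, var) __extension__			\
({									\
	unsigned int _i;						\
	uintmax_t _sum = 0;						\
	for (_i = 0; _i < NCPU; _i++)					\
		_sum += (arr)[_i].var;					\
	_sum;								\
})

/* Set every per-CPU counter to the same value. */
static void
fill_counters(uint32_t v)
{
	unsigned int i;

	for (i = 0; i < NCPU; i++)
		percpu_ss[i].n_in = v;
}

/* Returns 1 if the narrow sum wraps while the wide sum does not. */
static int
sums_differ_as_expected(void)
{
	uint32_t narrow;
	uintmax_t wide;

	fill_counters(UINT32_C(0x80000000));		/* 2^31 on each CPU */
	narrow = SKETCH_SUM(percpu_ss, n_in);		/* 4 * 2^31 wraps to 0 */
	wide = SKETCH_SUM_WIDE(percpu_ss, n_in);	/* 2^33 fits */
	return (narrow == 0 && wide == UINTMAX_C(0x200000000));
}
```

With four counters at 2^31 each, no individual counter has overflowed,
yet the __typeof-based sum wraps to 0 while the uintmax_t variant yields
2^33. Making the counters themselves 64-bit would give the same result
with the macro unchanged.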

I'll commit the above version of the macro this evening (GMT+10) unless 
I hear any objections. Thanks to all of you for your input.

Cheers,
Lawrence