network statistics in SMP
Bruce Evans
brde at optusnet.com.au
Thu Dec 17 08:10:29 UTC 2009
On Thu, 17 Dec 2009, Bruce Evans wrote:
> ...
> Actually, you can do better with a generation count. The generation count
> would at least tell you if you lost a race. The generation count should
> only be maintained while summing other counts, since it must be global and
> incremented by atomic ops (to avoid the races without even more costly
> locking which would make the generation count irrelevant) so maintaining
> it all the time would more than defeat the point of having per-CPU counters
> (all CPUs would compete for it at the same address). ...
Actually3, the generation count can be per-CPU and accessed without atomic
ops (provided reads of it on other CPUs return a consistent possibly-stale
value).
> Simple version:
> - bloat PCPU_INC(var) to do something like the following:
>
>       if (PCPU_GET(counter_summing_mode))
>               atomic_add_int(&counter_gen, 1);
>       OLD_PCPU_INC(var);
>
> - set PCPU_GET(counter_summing_mode) while summing. Needs heavyweight
>   synchronization (IPIs?) to set and clear the flag on other CPUs. Must
>   also make all other CPUs flush pending writes (so that a 64-bit counter
>   cannot be half-written at the beginning of the summing), but this will
>   happen automatically with any heavyweight synchronization.
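
For concreteness, the simple version might look something like the following
userland sketch. Everything here other than counter_gen and
counter_summing_mode is made up for illustration: the fixed NCPU, the packets
array, set_summing_mode() (a stand-in for the heavyweight IPI-style sync, here
just a loop) and the racer hook that simulates an update from another CPU in
the middle of the summing. It also assumes the summing code reads counter_gen
before and after and treats a change as a lost race.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NCPU 4				/* hypothetical fixed CPU count */

static atomic_uint counter_gen;		/* global, shared by all CPUs */
static bool counter_summing_mode[NCPU];	/* per-CPU flag */
static uint64_t packets[NCPU];		/* hypothetical per-CPU counter */

/* Stand-in for the bloated PCPU_INC(var) running on CPU `cpu`. */
void
simple_inc_packets(int cpu)
{
	assert(cpu >= 0 && cpu < NCPU);
	if (counter_summing_mode[cpu])
		atomic_fetch_add(&counter_gen, 1);
	packets[cpu]++;
}

/* Placeholder for the heavyweight sync that sets/clears the flag everywhere. */
void
set_summing_mode(bool on)
{
	for (int i = 0; i < NCPU; i++)
		counter_summing_mode[i] = on;
}

/* Simulated concurrent update, used only to provoke a lost race. */
void
simple_racer_cpu0(void)
{
	simple_inc_packets(0);
}

/*
 * Sum the per-CPU counters into *total.  Returns true if counter_gen did
 * not change while summing.  `racer`, if not NULL, is called after the
 * first CPU's counter has been read.
 */
bool
simple_sum_packets(uint64_t *total, void (*racer)(void))
{
	unsigned int gen_before, gen_after;
	uint64_t sum = 0;

	set_summing_mode(true);
	gen_before = atomic_load(&counter_gen);
	for (int i = 0; i < NCPU; i++) {
		sum += packets[i];
		if (i == 0 && racer != NULL)
			racer();
	}
	gen_after = atomic_load(&counter_gen);
	set_summing_mode(false);

	*total = sum;
	return (gen_before == gen_after);
}
```

A caller would presumably retry (or just report the sum as approximate) when
simple_sum_packets() returns false.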
Better version:
- bloat PCPU_INC(var) to do something like the following:

      OLD_PCPU_INC(counter_gen);
      OLD_PCPU_INC(var);

- sum all PCPU_GET(counter_gen) before summing the subset of ordinary
  counters of interest. This gives a value <= the race-free current sum
  of the generation counters, by reading consistent possibly-stale
  values.

  Then sync all counters as above. Note that the order of the above
  increments would be backwards if we used write ordering instead of
  a full sync -- with only write ordering the sum of the generation
  counts would be too high here if we happened to read it on one of the
  CPUs in between the above increments. This order is chosen since
  I don't want to have two increments of counter_gen in the above and/or
  further complications and bloat, so there must be some order, and
  the above order works right later.

  Then sum selected ordinary counters.

  Then sync the generation counters (or all counters, or arrange for
  write ordering) as above.

  Then sum the generation counters. This gives a value >= the race-free
  current sum at the end of summing the selected counters.
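
The steps above can be sketched in userland C roughly as follows. This is
only an illustration under stated assumptions: the per-CPU area is modelled
as a plain array, sync_all_cpus() is a no-op placeholder for the heavyweight
sync, and NCPU, struct pcpu_stats, the packets counter and the racer hook
(which simulates an update from another CPU in the middle of the summing) are
all invented names. It also assumes that equal generation sums before and
after are taken to mean that no update raced with the summing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NCPU 4			/* hypothetical fixed CPU count */

/* Hypothetical per-CPU block: generation count plus one ordinary counter. */
struct pcpu_stats {
	uint64_t counter_gen;	/* bumped before every counter update */
	uint64_t packets;	/* example ordinary counter */
};

static struct pcpu_stats pcpu[NCPU];

/* Stand-in for the bloated PCPU_INC(var): generation first, then var. */
void
pcpu_inc_packets(int cpu)
{
	assert(cpu >= 0 && cpu < NCPU);
	pcpu[cpu].counter_gen++;	/* no atomic op: only this CPU writes it */
	pcpu[cpu].packets++;
}

/* Placeholder for the full sync (e.g. IPIs); nothing to do single-threaded. */
void
sync_all_cpus(void)
{
}

/* Sum the per-CPU generation counts (possibly-stale reads on real hardware). */
uint64_t
sum_gen(void)
{
	uint64_t sum = 0;

	for (int i = 0; i < NCPU; i++)
		sum += pcpu[i].counter_gen;
	return (sum);
}

/* Simulated concurrent update, used only to provoke a lost race. */
void
racer_cpu0(void)
{
	pcpu_inc_packets(0);
}

/*
 * Sum the packets counters into *total.  Returns true if the generation
 * sums taken before and after agree.  `racer`, if not NULL, is called
 * after the first CPU's counter has been read.
 */
bool
sum_packets(uint64_t *total, void (*racer)(void))
{
	uint64_t gen_before, gen_after, sum = 0;

	gen_before = sum_gen();		/* <= the true sum at this point */
	sync_all_cpus();
	for (int i = 0; i < NCPU; i++) {
		sum += pcpu[i].packets;
		if (i == 0 && racer != NULL)
			racer();
	}
	sync_all_cpus();
	gen_after = sum_gen();		/* >= the true sum when summing ended */

	*total = sum;
	return (gen_before == gen_after);
}
```

Unlike the simple version, the increment path here has no branch and no
shared cache line; the cost is one extra per-CPU increment on every update.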
Bruce