[CFR][CFT] counter(9): new API for faster and raceless counters

Tue Apr 2 01:36:47 UTC 2013

On Mon, Apr 01, 2013 at 03:51:28PM +0400, Gleb Smirnoff wrote:
>   Hi!
> 
>   Together with Konstantin Belousov (kib@) we developed a new API that is
> initially purposed for (but not limited to) collecting statistical
> data in kernel.
...

really great work, thanks for tackling this.
I have some comments inline:

API:

> o MI implementation of counter_u64_add() is:
> 
>      critical_enter();
>      *(uint64_t *)zpcpu_get(c) += inc;
>      critical_exit();

- there are several places which update multiple counters at once
  (e.g. packet and byte counters, global and per flow/socket),
  so I wonder if it would help to provide a "protected" version of
  counter_u64_add() that needs the critical_enter/exit pair
  only once per group of updates. Something like:

	PROTECT_COUNTERS(
		safe_counter_u64_add(c, x);
		safe_counter_u64_add(c, x);
		safe_counter_u64_add(c, x);
	);

  where PROTECT_COUNTERS() would expand to the critical_enter/exit pair
  on architectures that need it, and to nothing on the others.
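
  To make the idea concrete, here is a rough userspace sketch of how
  such a macro could look on an architecture that needs the critical
  section. Everything in it is an assumption for illustration only:
  critical_enter()/critical_exit() are stubs that just track nesting,
  the counter is a plain uint64_t pointer instead of real per-CPU
  storage, and safe_counter_u64_add(), PROTECT_COUNTERS() and
  account_packet() are hypothetical names, not part of the proposed API.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel primitives (illustration only). */
static int crit_nesting;	/* current critical-section depth */
static int crit_entries;	/* number of critical_enter() calls so far */

static void
critical_enter(void)
{
	crit_nesting++;
	crit_entries++;
}

static void
critical_exit(void)
{
	assert(crit_nesting > 0);	/* enter/exit must be balanced */
	crit_nesting--;
}

/* Single-CPU mock; a real counter would point at per-CPU storage. */
typedef uint64_t *counter_u64_t;

/* Hypothetical: only valid inside PROTECT_COUNTERS(). */
static inline void
safe_counter_u64_add(counter_u64_t c, int64_t inc)
{
	assert(crit_nesting > 0);
	*c += inc;
}

/*
 * Wrap a group of updates in one critical section. On an architecture
 * that can update a counter in a single instruction, this would expand
 * to just the statements themselves.
 */
#define PROTECT_COUNTERS(...) do {	\
	critical_enter();		\
	__VA_ARGS__			\
	critical_exit();		\
} while (0)

/* Example: account one packet with a single enter/exit pair. */
static void
account_packet(counter_u64_t pkts, counter_u64_t bytes, uint64_t len)
{
	PROTECT_COUNTERS(
		safe_counter_u64_add(pkts, 1);
		safe_counter_u64_add(bytes, (int64_t)len);
	);
}
```

  With this shape, two counter updates cost one critical_enter/exit
  pair instead of two.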

...

BENCHMARK:

> I've got a simple benchmark. A syscall that solely updates a counter is
> implemented. Number of processes is spawned, equal to number of cores,
> each process binds itself to a dedicated CPU and calls the syscall 10^6
> times and exits. Parent wait(2)s for them all and I measure real time of

- I am under the impression that these benchmarks are dominated
  by the syscall time, and that the new counters would show a much
  better relative performance (compared to racy or atomic)
  when doing 10..100 counter ops per syscall. Any chance of rerunning
  the test in this configuration?
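
  The reasoning can be written as a simple cost model. This is only a
  sketch under the assumption that the time of one call is a fixed
  syscall overhead plus a per-op cost times the number of counter ops;
  relative_time() and its parameters are hypothetical names, and no
  measured values are implied.

```c
/*
 * Ratio of per-call times for two counter implementations, A and B,
 * assuming time = syscall overhead + nops * per-op cost.
 * The denominator is assumed to be positive.
 */
static double
relative_time(double syscall_ns, double op_a_ns, double op_b_ns,
    unsigned nops)
{
	return ((syscall_ns + nops * op_a_ns) /
	    (syscall_ns + nops * op_b_ns));
}
```

  With one op per syscall and a large syscall overhead the ratio stays
  close to 1 whatever the per-op costs are; as nops grows it approaches
  op_a_ns / op_b_ns, which is the number we actually want to see.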

cheers
luigi