[CFR][CFT] counter(9): new API for faster and raceless counters
glebius at FreeBSD.org
Tue Apr 2 15:40:19 UTC 2013
On Tue, Apr 02, 2013 at 04:57:22PM +0200, Luigi Rizzo wrote:
L> > Here is patch for review. It adds 4 more primitives:
L> > counter_enter();
L> > counter_u64_add_prot(c, x);
L> > counter_u64_subtract_prot(c, x);
L> > counter_exit();
L> thanks for the patch. I have three more comments:
L> - is it really necessary to have the "subtract" version?
L> Couldn't one just make "x" an int64_t? Or would that give
L> too many warnings at runtime, maybe?
Agreed. See patch.
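
To illustrate the signed-increment idea, here is a minimal userland sketch. It is not the code from the patch: struct sketch_counter and sketch_counter_add() are hypothetical names, and a single uint64_t stands in for a per-CPU slot. With "x" taken as an int64_t, a negative value does the job of a separate subtract primitive, because addition into a uint64_t wraps modulo 2^64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one per-CPU counter slot. */
struct sketch_counter {
	uint64_t value;
};

/*
 * Add a signed delta to the counter.  Converting a negative int64_t
 * to uint64_t is well defined in C (the result is taken modulo 2^64),
 * so adding the converted value subtracts from the counter.
 */
static void
sketch_counter_add(struct sketch_counter *c, int64_t x)
{
	assert(c != NULL);
	c->value += (uint64_t)x;
}
```

Calling sketch_counter_add(&c, 5) and then sketch_counter_add(&c, -7) on a counter holding 10 leaves it at 8, so one entry point covers both directions.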
L> - (this can be fixed later) in the i386 version, counter_enter()
L> and counter_exit() have an if statement which may become quite
L> expensive if mispredicted. Also, the test is repeated 3 times in
L> counter_u64_add() (enter/add/exit). Hopefully the compiler
L> optimizes out the extra calls, but the previous version seemed
L> more readable. Anyway, at some point we should figure out
L> whether putting likely/unlikely annotations on the result of
L> (cpu_feature & CPUID_CX8) may improve performance where it matters.
Agreed. See patch.
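
A rough userland sketch of the annotation being discussed follows. Everything prefixed with sketch_ or SKETCH_ is hypothetical: the variable imitates the kernel's cpu_feature word, the mask uses the CPUID.1:EDX bit 8 position of CX8, and the macro wraps the GCC/Clang __builtin_expect() builtin. The real i386 fast path would use cmpxchg8b and the slow path would need protection against interrupts; both are reduced to a plain add here so that only the single, annotated feature test is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hint macro built on the GCC/Clang builtin. */
#define SKETCH_PREDICT_FALSE(e)	__builtin_expect(!!(e), 0)

/* CX8 is reported in CPUID.1:EDX bit 8. */
#define SKETCH_CPUID_CX8	0x00000100u

/* Hypothetical stand-in for the kernel's cpu_feature word. */
static unsigned int sketch_cpu_feature = SKETCH_CPUID_CX8;

/* Counts how often the rare path was taken, for illustration only. */
static int sketch_slow_path_hits;

/*
 * Test the feature bit once per update and tell the compiler that
 * the "no CX8" case is the unlikely one.
 */
static void
sketch_counter_add_cx8(uint64_t *c, int64_t x)
{
	assert(c != NULL);
	if (SKETCH_PREDICT_FALSE((sketch_cpu_feature & SKETCH_CPUID_CX8) == 0)) {
		/* Rare path: CPU without cmpxchg8b. */
		sketch_slow_path_hits++;
		*c += (uint64_t)x;
		return;
	}
	/* Common path: a real implementation would use cmpxchg8b here. */
	*c += (uint64_t)x;
}
```

Whether the hint actually helps would have to be measured on hardware where it matters; the sketch only shows where the annotation would sit.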
L> - do you plan to provide an API to initialize a counter to 0 or a
L> specific value ? I suppose this is done implicitly on allocation,
L> but there are cases (e.g. ipfw) where the application explicitly
L> zeroes counters.
There already is counter_u64_zero().
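
For readers who want a picture of what explicit zeroing means for a per-CPU counter, here is a small userland model. It is an assumption-laden sketch, not the kernel implementation: SKETCH_NCPU, struct sketch_pcpu_counter and the sketch_* functions are hypothetical, and the CPU index is passed by hand instead of being taken from the running CPU. The point is only that the total is the sum over all slots, so zeroing has to clear every slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NCPU	4	/* hypothetical number of CPUs */

/* One slot per CPU; each CPU only ever updates its own slot. */
struct sketch_pcpu_counter {
	uint64_t slot[SKETCH_NCPU];
};

/* Update the slot that belongs to the given CPU. */
static void
sketch_pcpu_add(struct sketch_pcpu_counter *c, int cpu, int64_t x)
{
	assert(c != NULL && cpu >= 0 && cpu < SKETCH_NCPU);
	c->slot[cpu] += (uint64_t)x;
}

/* The counter's value is the sum of all per-CPU slots. */
static uint64_t
sketch_pcpu_fetch(const struct sketch_pcpu_counter *c)
{
	uint64_t sum = 0;

	assert(c != NULL);
	for (int i = 0; i < SKETCH_NCPU; i++)
		sum += c->slot[i];
	return (sum);
}

/* Zeroing must clear every slot, not just the current CPU's. */
static void
sketch_pcpu_zero(struct sketch_pcpu_counter *c)
{
	assert(c != NULL);
	for (int i = 0; i < SKETCH_NCPU; i++)
		c->slot[i] = 0;
}
```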
L> > So a 63% speedup, not to mention the fact that in such a tight loop 98% of
L> > parallel updates are lost on a racy counter :)
L> > A tight loop with atomic_add() is 22 times (twenty-two times) slower than
L> > the new counter. I didn't bother to run ministat :)
L> yes i think this really does justice to the improvements of the new code
L> (i am a bit unclear on what actual test you ran / how many counter_u64_add()
L> per syscall you have, but i do expect the racy counters to be much slower
L> and much less reliable, and the 20x slowdown with atomics is completely
The test ran a for loop of 2 * 10^6 iterations, each one updating a counter.
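
The shape of such a loop can be sketched in userland as follows. This is not the benchmark that produced the numbers above: the function names are hypothetical, C11 <stdatomic.h> stands in for the kernel's atomic_add(), nothing is timed, and the loops are meant to be run from a single thread, so the lost-update effect of a racy counter does not show up.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_ITERATIONS	2000000u	/* 2 * 10^6, as in the test */

/* Plain increment: cheap, but racy if the counter were shared. */
static uint64_t
sketch_loop_plain(uint32_t n)
{
	/* volatile keeps the compiler from folding the loop away. */
	volatile uint64_t c = 0;

	for (uint32_t i = 0; i < n; i++)
		c += 1;
	return (c);
}

/* Atomic increment: never loses an update, at a higher cost per add. */
static uint64_t
sketch_loop_atomic(uint32_t n)
{
	_Atomic uint64_t c = 0;

	for (uint32_t i = 0; i < n; i++)
		atomic_fetch_add_explicit(&c, 1, memory_order_relaxed);
	return (atomic_load_explicit(&c, memory_order_relaxed));
}
```

Run from one thread, both loops end at the same total; the difference between them only becomes visible once they are timed or run from several threads at once.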
Totus tuus, Glebius.
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 8330 bytes
Desc: not available