[CFR][CFT] counter(9): new API for faster and raceless counters

Gleb Smirnoff glebius at FreeBSD.org
Tue Apr 2 15:40:19 UTC 2013


  Luigi,

On Tue, Apr 02, 2013 at 04:57:22PM +0200, Luigi Rizzo wrote:
L> > Here is patch for review. It adds 4 more primitives:
L> > 
L> > counter_enter();
L> > counter_u64_add_prot(c, x);
L> > counter_u64_subtract_prot(c, x);
L> > counter_exit();
L> 
L> thanks for the patch. I have three more comments:
L> 
L> - is it really necessary to have the "subtract" version?
L>   Couldn't one just make "x" an int64_t? Or does that give
L>   too many warnings at runtime, maybe?

Agreed. See patch.
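
A minimal userland sketch of why a signed increment makes a separate
subtract primitive unnecessary. The name demo_counter_add() and the plain
uint64_t counter are stand-ins for illustration only, not the code in the
patch; the real counter(9) storage is per-CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a counter update that takes a signed delta.
 * Converting a negative int64_t to uint64_t wraps modulo 2^64, so adding
 * the converted value subtracts |x| from the counter.
 */
static void
demo_counter_add(uint64_t *c, int64_t x)
{
	assert(c != NULL);
	*c += (uint64_t)x;
}
```

With this shape a caller passes -3 to subtract three, and no second
primitive is needed.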

L> - (this can be fixed later) in the i386 version, counter_enter()
L>   and counter_exit() have an if statement which may become quite
L>   expensive if mispredicted. Also, the test is repeated 3 times in
L>   counter_u64_add() (enter/add/exit). Hopefully the compiler
L>   optimizes out the extra calls, but the previous version seemed
L>   more readable. Anyway, at some point we should figure out
L>   whether putting likely/unlikely annotations on the result of
L>   (cpu_feature & CPUID_CX8) may improve performance where it matters.

Agreed. See patch.
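
A rough userland sketch of the idea discussed above: test the feature bit
once and hint the branch as the common case. Everything prefixed demo_ or
DEMO_ is an assumption made for illustration (the two update paths are empty
stand-ins, and the hint macro uses the GCC/Clang __builtin_expect builtin);
it is not the code in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Branch hint built on the GCC/Clang builtin. */
#define demo_likely(x)	__builtin_expect(!!(x), 1)

#define DEMO_CPUID_CX8	0x00000100u	/* stand-in for CPUID_CX8 */

/* Stand-in for the kernel's cpu_feature word. */
static unsigned int demo_cpu_feature = DEMO_CPUID_CX8;

/* Counts how often the slow path was taken, for inspection only. */
static int demo_fallback_calls;

/* Stand-in for the fast path used when cmpxchg8b is available. */
static void
demo_add_cx8(uint64_t *c, int64_t x)
{
	*c += (uint64_t)x;
}

/* Stand-in for the path used on CPUs without cmpxchg8b. */
static void
demo_add_fallback(uint64_t *c, int64_t x)
{
	demo_fallback_calls++;
	*c += (uint64_t)x;
}

static void
demo_counter_add_hinted(uint64_t *c, int64_t x)
{
	assert(c != NULL);
	/* One test per update, hinted as almost always true. */
	if (demo_likely(demo_cpu_feature & DEMO_CPUID_CX8))
		demo_add_cx8(c, x);
	else
		demo_add_fallback(c, x);
}
```

Whether the hint helps in practice would have to be measured; the sketch
only shows where the annotation would go.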

L> - do you plan to provide an API to initialize a counter to 0 or a
L>   specific value ? I suppose this is done implicitly on allocation,
L>   but there are cases (e.g. ipfw) where the application explicitly
L>   zeroes counters.

There already is counter_u64_zero().
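
For the ipfw-style case, a small userland model of what zeroing means for a
counter kept as one slot per CPU. The struct, DEMO_NCPU and the demo_
functions are hypothetical and only model the behaviour; kernel code would
call counter_u64_zero() itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NCPU	4	/* assumed CPU count for the model */

/* Model of a counter: one slot per CPU, summed on read. */
struct demo_counter {
	uint64_t slot[DEMO_NCPU];
};

/* Reset every per-CPU slot, as an explicit zeroing operation would. */
static void
demo_counter_zero(struct demo_counter *c)
{
	assert(c != NULL);
	for (size_t i = 0; i < DEMO_NCPU; i++)
		c->slot[i] = 0;
}

/* Read the counter by summing all per-CPU slots. */
static uint64_t
demo_counter_fetch(const struct demo_counter *c)
{
	uint64_t sum = 0;

	assert(c != NULL);
	for (size_t i = 0; i < DEMO_NCPU; i++)
		sum += c->slot[i];
	return (sum);
}
```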

L> > So a 63% speedup, not to mention that in such a tight loop 98% of
L> > parallel updates are lost on the racy counter :)
L> > 
L> > A tight loop with atomic_add() is 22 times (twenty-two times) slower than
L> > the new counter. I didn't bother to run ministat :)
L> 
L> yes, i think this really does justice to the improvements of the new code
L> (i am a bit unclear on what actual test you ran / how many counter_u64_add()
L> per syscall you have, but i do expect the racy counters to be much slower
L> and much less reliable, and the 20x slowdown with atomics is completely
L> expected.)

The test ran a for loop of 2 * 10^6 iterations, each updating a counter.
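
For readers who want the shape of such a loop, here is a userland
approximation using C11 atomics. It is not the kernel test that produced the
numbers quoted above: the demo_ names are made up, it runs on a single
thread, and it does no timing of its own.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_ITERATIONS	2000000u	/* 2 * 10^6, as in the test */

/* Plain (non-atomic) updates; volatile keeps the loop from being folded. */
static uint64_t
demo_loop_plain(uint32_t n)
{
	volatile uint64_t c = 0;

	for (uint32_t i = 0; i < n; i++)
		c += 1;
	return (c);
}

/* The same loop, with every update done as an atomic read-modify-write. */
static uint64_t
demo_loop_atomic(uint32_t n)
{
	_Atomic uint64_t c = 0;

	for (uint32_t i = 0; i < n; i++)
		atomic_fetch_add(&c, 1);
	return (atomic_load(&c));
}

/* On one thread no update can be lost, so both loops must count to n. */
static void
demo_run_both(void)
{
	assert(demo_loop_plain(DEMO_ITERATIONS) == DEMO_ITERATIONS);
	assert(demo_loop_atomic(DEMO_ITERATIONS) == DEMO_ITERATIONS);
}
```

To compare costs, each call can be wrapped in clock_gettime() and the runs
fed to ministat. Showing lost updates on the plain loop would additionally
need several threads sharing one counter.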

-- 
Totus tuus, Glebius.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: counter_API_extend.diff
Type: text/x-diff
Size: 8330 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/freebsd-arch/attachments/20130402/2c690860/attachment.diff>

