clock error: callouts are run one tick late
Luigi Rizzo
rizzo at iet.unipi.it
Wed Sep 9 01:10:11 UTC 2009
On Wed, Sep 09, 2009 at 03:01:37AM +0200, Luigi Rizzo wrote:
> RELENG_8/amd64 (can not try on i386) has the following problem:
>
> callout_reset(..., t, ...)
>
> processes the callout after t+1 ticks instead of the t ticks
> that one sees on RELENG_7 and earlier.
>
> I found it by pure chance, noticing that dummynet on RELENG_8
> has a jitter that is two ticks instead of one tick.
> Other systems with rely on frequent callouts might see
> problems as well.
>
> An indirect way to see the problem is the following:
>
> kldload dummynet
>
> sysctl net.inet.ip.dummynet.tick_adjustment; \
> sleep 1; sysctl net.inet.ip.dummynet.tick_adjustment
>
> on a working system, the variable should remain mostly unchanged;
> on a non-working system you see it growing at a rate HZ/2
>
> I suspect the bug is introduced by the change in kern_timeout.c rev. 1.114
>
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/kern/kern_timeout.c.diff?r1=1.113;r2=1.114
>
> which changes softclock() to stop before the current 'ticks'
> so processes everything one tick late.
>
> I understand the race described in the cvs log, but this does
> not seem a proper fix -- it violates POLA by changing the semantics
> of callout_reset(), and does not really fix the race, but just adds
> an extra uncertainty of 1 tick on when a given callout will be run
>
> If the problem is a race between hardclock() which updates 'ticks',
> and the various hardclock_cpu() instances which might arrive early,
> I would suggest two alternative options:
>
> 1. create a per-cpu 'ticks' (say a field cc_ticks in struct callout_cpu),
> increment it at the start of hardclock_cpu, and use cc->ticks
> instead of 'ticks' in the various callout functions involved
> with manipulation of the callwheel
> callout_tick(), softclock(), callout_reset_on()
>
> 2. start softclock() at cc->cc_softticks -1, i.e.
>
> ...
> CC_LOCK(cc)
> - while (cc->cc_softticks != ticks) {
> + while (cc->cc_softticks-1 != ticks) {
> ...
#2 also need this change in callout_tick()
mtx_lock_spin_flags(&cc->cc_lock, MTX_QUIET);
- for (; (cc->cc_softticks - ticks) < 0; cc->cc_softticks++) {
+ for (; (cc->cc_softticks - ticks) <= 0; cc->cc_softticks++) {
bucket = cc->cc_softticks & callwheelmask;
Just tested it, it seems to fix the problem.
cheers
luigi
More information about the freebsd-current
mailing list