Fast gettimeofday(2) and clock_gettime(2)

Fri Jun 8 08:03:52 UTC 2012

On Thu, 7 Jun 2012, Konstantin Belousov wrote:

> On Thu, Jun 07, 2012 at 01:00:34PM +1000, Bruce Evans wrote:
>>
>> tc_windup() calls in close succession are bugs, since they cycle the
>> timehands faster than they were designed to be cycled.  We already have
>> too many of these bugs (where tc_setclock() calls tc_windup(); I didn't
>> notice this particular problem with it before).  Now I will point out
>> that version
>> 2 of your patch adds more of these calls, apparently to get changes to
>> happen sooner.  But in sysctl_kern_timecounter_hardware(), such a call
>> was intentionally left out since it is not needed.  Note that tc_tick
>> prevents calls to tc_windup() more often than about once per msec if
>> hz > 1000.
> No, I did not add more tc_windup calls. I added a recalculation
> of the shared page content on the timecounter change, which is not
> the same as a tc_windup() call. This is exactly to handle disabling
> usermode rdtsc use when the kernel timecounter hardware changes.

Oops.  I saw a parameter named tc_windup and didn't look too closely
at the event handler for this.  Please use a slightly different name.

Frequent updates of the shared page may cause the same too-fast cycling
as frequent calls to tc_windup().  Are event handlers rate-limited?
If not, then someone changing the timecounter hardware from a loop
in userland could cause similar problems to a settimeofday() loop.
Both are privileged operations so this is not a large problem, but it
is a stress test that should pass.
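
A minimal sketch of the kind of rate limit asked about above, assuming a
monotonic nanosecond clock supplied by the caller.  The names
(struct rate_limit, rate_limit_allow()) are hypothetical and are not an
existing kernel or eventhandler API:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state for limiting how often an update may run. */
struct rate_limit {
	uint64_t last_ns;         /* time of the last accepted update */
	uint64_t min_interval_ns; /* minimum spacing between updates */
	bool     primed;          /* false until the first update is accepted */
};

/*
 * Return true if an update at time now_ns may proceed, false if it
 * arrives too soon after the previous accepted update.  The caller is
 * assumed to pass a monotonically increasing now_ns.
 */
static bool
rate_limit_allow(struct rate_limit *rl, uint64_t now_ns)
{
	if (rl->primed && now_ns - rl->last_ns < rl->min_interval_ns)
		return (false);
	rl->last_ns = now_ns;
	rl->primed = true;
	return (true);
}
```

A rejected update would still have to be deferred rather than dropped,
since a timecounter change must reach the shared page eventually; the
sketch only shows the spacing check.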

>>  [jhb wrote]
>>> There was apparently another issue with version 2. The bcopy() is not
>>> atomic, so potentially libc could read a wrong tk_current. I redid
>>> the interface to write to the shared page to allow use of real atomics.
>>
>> Timecounter code is supposed to be lock-free except for some time-domain
>> locking.  I only see 1 problem with this: where tc_windup() writes the
>> generation count and other things without asking for these writes to
>> be ordered.  In most cases, the time-domain locking prevents problems.
> In fact, on x86 the ordering is strong enough that no barriers are needed,
> which is why the problem has gone unnoticed so far.

Only the x86 write ordering is clearly strong enough (see another reply).
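
To make the ordering question concrete, here is a sketch of a
generation-count protocol with the ordering requested explicitly on both
the write side and the read side, using C11 atomics.  It is loosely
modeled on the scheme discussed above, not copied from the kernel: the
struct, its fields and the th_init()/th_write()/th_read() helpers are
hypothetical, and a single writer is assumed.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one timehands slot. */
struct th_sketch {
	atomic_uint      gen;  /* 0 means "update in progress" */
	_Atomic uint64_t sec;  /* payload, accessed with relaxed atomics */
	_Atomic uint64_t frac;
};

static void
th_init(struct th_sketch *th)
{
	atomic_init(&th->sec, 0);
	atomic_init(&th->frac, 0);
	atomic_init(&th->gen, 1);	/* nonzero: slot is valid */
}

/* Single writer: invalidate, update the payload, then publish. */
static void
th_write(struct th_sketch *th, uint64_t sec, uint64_t frac)
{
	unsigned ogen;

	ogen = atomic_load_explicit(&th->gen, memory_order_relaxed);
	atomic_store_explicit(&th->gen, 0, memory_order_relaxed);
	/* Order the gen = 0 store before the payload stores. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&th->sec, sec, memory_order_relaxed);
	atomic_store_explicit(&th->frac, frac, memory_order_relaxed);
	if (++ogen == 0)		/* skip 0 on wraparound */
		ogen = 1;
	/* Order the payload stores before the new generation. */
	atomic_store_explicit(&th->gen, ogen, memory_order_release);
}

/* Reader: retry until the generation is nonzero and unchanged. */
static void
th_read(struct th_sketch *th, uint64_t *sec, uint64_t *frac)
{
	unsigned gen;

	do {
		gen = atomic_load_explicit(&th->gen, memory_order_acquire);
		*sec = atomic_load_explicit(&th->sec, memory_order_relaxed);
		*frac = atomic_load_explicit(&th->frac, memory_order_relaxed);
		/* Order the payload loads before the generation re-check. */
		atomic_thread_fence(memory_order_acquire);
	} while (gen == 0 ||
	    gen != atomic_load_explicit(&th->gen, memory_order_relaxed));
}
```

On x86 the release and acquire orderings here should cost nothing beyond
compiler barriers; on more weakly ordered machines they become real
barrier instructions, which is where unordered writes and reads would
otherwise show up.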

Bruce