Micro-benchmark for various time syscalls...

Bruce Evans brde at optusnet.com.au
Tue Jun 3 10:14:24 UTC 2008


On Mon, 2 Jun 2008, Sean Chittenden wrote:

>> I wouldn't expect SMP to make much difference between CLOCK_REALTIME and
>> CLOCK_REALTIME_FAST.  The only difference is that the former calls
>> nanotime() where the latter calls getnanotime().  nanotime() always does
>> more, but it doesn't have any extra SMP overheads in most cases (in rare
>> cases like i386 using the i8254 timecounter, it needs to lock accesses to
>> the timecounter hardware).  gettimeofday() always does more than
>> CLOCK_REALTIME, but again no more for SMP.
>
> You may be right, I can only speculate.  Going off of phk@'s rhetorical 
> questions regarding gettimeofday(2) working across cores/threads, I assumed 
> there would be some kind of synchronization.
>
> http://lists.freebsd.org/mailman/htdig/freebsd-current/2005-October/057280.html

The synchronization is all in binuptime().  It is quite delicate.  It
depends mainly on an unlocked, non-atomically accessed generation count
for software synchronization and on the hardware being almost-automatically
synchronized with itself for hardware synchronization.  It takes various
magic for an unlocked, non-atomically accessed generation count to work.
Since it has no locking and executes identical code for SMP and !SMP, it
has identical overheads for SMP and !SMP.  Hardware is almost-automatically
synchronized with itself by using identical hardware for all CPUs.  This
is what breaks down for the TSC on SMP systems (power management may affect
both).  Some hardware timecounters like the i8254 require locking to give
exclusive access to the hardware.
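
The read side of the pattern looks roughly like this (a sketch of the
idea only, not the actual sys/kern/kern_tc.c code; names are mine):

%%%
/*
 * Unlocked generation-count pattern.  The writer zeroes "gen" while it
 * rewrites the state and then stores the next nonzero value; a reader
 * retries whenever "gen" was zero or changed across its reads.  The
 * "magic" is in the ordering of these loads and stores.
 */
struct th_state {
   volatile unsigned int gen;	/* 0 while an update is in progress */
   /* ... scale, offset, last hardware counter reading ... */
};

static void
th_read(struct th_state *th)
{
   unsigned int gen;

   do {
     gen = th->gen;		/* snapshot the generation */
     /* ... copy out scale/offset and read the hardware ... */
   } while (gen == 0 || gen != th->gen);	/* torn read: retry */
}
%%%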

>>> clock_gettime(CLOCK_REALTIME_FAST) is likely the ideal function for most 
>>> authors (CLOCK_REALTIME_FAST is supposed to be precise to +/- 10ms of 
>>> CLOCK_REALTIME's value[2]).  In fact, I'd assume that CLOCK_REALTIME_FAST 
>>> is just as accurate as Linux's gettimeofday(2) (a statement I can't back 
>>> up, but believe is likely to be correct) and therefore there isn't much 
>>> harm (if any) in seeing clock_gettime(2) + CLOCK_REALTIME_FAST receive 
>>> more widespread use vs. gettimeofday(2).  FYI.  -sc
>> 
>> The existence of most of CLOCK_* is a bug.  I wouldn't use 
>> CLOCK_REALTIME_FAST
>> for anything (if only because it doesn't exist in most kernels that I
>> run).
>
> I think that's debatable, actually.  I modified my little micro-benchmark

It's debatable, but not with me :-).

> program to test the realtime values returned from each execution and found
> that CLOCK_REALTIME_FAST likely updates itself sufficiently frequently for
> most applications (not all, but most).  My test ensures that time doesn't go
> backwards and tallies the number of times that the values are identical.
> It'd be nice if CLOCK_REALTIME_FAST incremented by a small and reasonable
> fudge factor every time it's invoked, so that the values aren't identical.
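
Something along these lines, presumably (my reconstruction of the test
as described, not the actual program; CLOCK_REALTIME_FAST is
FreeBSD-specific):

%%%
#include <stdio.h>
#include <time.h>

int main(void) {
   struct timespec now, prev;
   long identical = 0, backwards = 0, i;

   clock_gettime(CLOCK_REALTIME_FAST, &prev);
   for(i = 0; i < 1000000; i++) {
     clock_gettime(CLOCK_REALTIME_FAST, &now);
     if(now.tv_sec < prev.tv_sec ||
        (now.tv_sec == prev.tv_sec && now.tv_nsec < prev.tv_nsec))
       backwards++;		/* time went backwards */
     else if(now.tv_sec == prev.tv_sec && now.tv_nsec == prev.tv_nsec)
       identical++;		/* clock did not advance */
     prev = now;
   }
   printf("%ld identical, %ld backwards\n", identical, backwards);
   return 0;
}
%%%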

I would probably go direct to the hardware if doing a large enough number
of measurements for clock granularity or access overheads to matter.
Otherwise, CLOCK_REALTIME or CLOCK_MONOTONIC is best.  These are easy to use
and give the most accurate results possible.
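
On i386/amd64, "direct to the hardware" usually means reading the TSC,
e.g. (GCC-style inline asm assumed; the result is in CPU cycles, not
wall time, and as noted above it is only trustworthy on a single CPU
whose frequency doesn't change underneath you):

%%%
#include <stdint.h>

static inline uint64_t
rdtsc(void)
{
   uint32_t lo, hi;

   /* rdtsc returns the 64-bit cycle counter in edx:eax. */
   __asm__ __volatile__("rdtsc" : "=a" (lo), "=d" (hi));
   return ((uint64_t)hi << 32 | lo);
}
%%%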

>>> PS  Is there a reason that time(3) can't be implemented in terms of 
>>> clock_gettime(CLOCK_SECOND)?  10ms seems precise enough compared to 
>>> time_t's whole second resolution.
>> 
>> I might use CLOCK_SECOND (unlike CLOCK_REALTIME_FAST), since the low
>> accuracy timers provided by the get*time() family are accurate enough
>> to give the time in seconds.  Unfortunately, they are still broken --
>> they are all incoherent relative to nanotime() and some are incoherent
>> relative to each other.  CLOCK_SECOND can lag the time in seconds given
>> by up to tc_tick/HZ seconds.  This is because CLOCK_SECOND returns the
>> time in seconds at the last tc_windup(), so it misses seeing rollovers
>> of the second in the interval between the rollover and the next
>> tc_windup(), while nanotime() doesn't miss seeing these rollovers so
>> it gives incoherent times, with nanotime()/CLOCK_REALTIME being correct
>> and time_second/CLOCK_SECOND broken.
>
> Interesting.  Incoherent, but accurate enough?  We're talking about a <10ms 
> window of incoherency, right?

Yes.  10ms is a lot.  It results in about 1 in every 100 timestamps being
incoherent, so my fs benchmark that tests for file times being coherent
(it actually tests for ctime/mtime/atime updates happening in the correct
order when file times are incoherent with time(1)) doesn't have to run
for very long to find an incoherency.  After rounding the times to a seconds
boundary, the amount of the incoherency is rounded up from 1-10ms to 1
second.  Incoherencies of 1 second persist for the length of the window.
The delicate locking in binuptime() doesn't allow the data structure updates
that would be required to make all the access methods coherent.  Full
locking would probably be required for that.
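
The lag is easy to observe directly.  A sketch (CLOCK_SECOND is
FreeBSD-specific; error handling omitted) that counts how often the
seconds value from CLOCK_SECOND is behind CLOCK_REALTIME's:

%%%
#include <stdio.h>
#include <time.h>

int main(void) {
   struct timespec fast, precise;
   long lags = 0, n = 1000000, i;

   for(i = 0; i < n; i++) {
     clock_gettime(CLOCK_SECOND, &fast);
     clock_gettime(CLOCK_REALTIME, &precise);
     /* A coherent pair never has the low-accuracy clock behind. */
     if(fast.tv_sec < precise.tv_sec)
       lags++;
   }
   printf("%ld of %ld samples saw CLOCK_SECOND lag\n", lags, n);
   return 0;
}
%%%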

>> Some of my benchmark results:
>
> Can I run this same test/see how this was written?

It is an old sanity test program by wollman which I've touched as little
as possible, just to convert to CLOCK_REALTIME and to hack around some
bugs involving array overruns which became larger with the larger range
of values in nanoseconds.  He probably doesn't want to see it, but I
will include it here :-).

%%%
#include <sys/types.h>
#include <stdio.h>
#include <time.h>
#include <sys/time.h>
#include <unistd.h>
#include <math.h>
#include <limits.h>
#include <string.h>

#define N 2000000

int diffs[N];
int hist[N * 10];		/* XXX assumes diffs fall in [0, N * 10) ns */

int main(void) {
   int i, j;
   int min, max;
   double sum, mean, var, sumsq;
   struct timespec tv, otv;

   memset(diffs, '\0', sizeof diffs); /* fault in whole array, we hope */
   for(i = 0; i < N; i++) {
     clock_gettime(CLOCK_REALTIME, &tv);
     do {
       otv = tv;
       clock_gettime(CLOCK_REALTIME, &tv);
     } while(tv.tv_sec == otv.tv_sec && tv.tv_nsec == otv.tv_nsec);
     diffs[i] = tv.tv_nsec - otv.tv_nsec + 1000000000 * (tv.tv_sec - otv.tv_sec);
   }

   min = INT_MAX;
   max = INT_MIN;
   sum = 0;
   sumsq = 0;
   for(i = 0; i < N; i++) {
     if(diffs[i] > max) max = diffs[i];
     if(diffs[i] < min) min = diffs[i];
     sum += diffs[i];
     sumsq += (double)diffs[i] * diffs[i]; /* cast avoids int overflow */
   }

   mean = sum / (double)N;
   var = sumsq / (double)N - mean * mean;	/* E[x^2] - mean^2 */

   printf("min %d, max %d, mean %f, std %f\n", min, max, mean, sqrt(var));

   for(i = 0; i < N; i++) {
     /* Guard against overrunning hist[] with outlier diffs. */
     if(diffs[i] >= 0 && diffs[i] < N * 10)
       hist[diffs[i]]++;
   }

   /* Report the five most common diffs (the modes of the histogram). */
   for(j = 0; j < 5; j++) {
     max = 0;
     min = 0;
     for(i = 0; i < N; i++) {
       if(hist[i] > max) {
         max = hist[i];
         min = i;                /* reuse "min" as the mode's bin index */
       }
     }
     printf("#%d: %d ns (%d observations)\n", j + 1, min, max);
     hist[min] = 0;
   }

   return 0;
}
%%%
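
(It needs libm for sqrt(); something like "cc -O -o tstest tstest.c -lm"
builds it.  The file name here is mine.)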

>> Other implementation bugs (all in clock_getres()):
>> - all of the clock ids that use getnanotime() claim a resolution of 1
>>  nsec, but that is bogus.  The actual resolution is more like tc_tick/HZ.
>>  The extra resolution in a struct timespec is only used to return
>>  garbage related to the incoherency of the clocks.  (If it could be
>>  arranged that tc_windup() always ran on a tc_tick/HZ boundary, then
>>  the clocks would be coherent and the times would always be a multiple
>>  of tc_tick/HZ, with no garbage in low bits.)
>> - CLOCK_VIRTUAL and CLOCK_PROF claim a resolution of 1/hz, but that is
>>  bogus.  The actual resolution is more like 1/stathz, or perhaps 1
>>  microsecond.  hz is irrelevant here since statclock ticks are used.
>>  statclock ticks only have a resolution of 1/stathz, but if 1 nsec is
>>  correct for CLOCK_REALTIME_FAST, then 1 usec is correct here since
>>  calcru() calculates the time to a resolution of 1 usec; it is just
>>  very inaccurate at that resolution.
>> "Resolution" is a poor term for the functionality needed here.  I think
>> a hint about the accuracy is more important.  In simple implementations
>> using interrupts and ticks, the accuracy would be about the same as
>> the resolution, but FreeBSD is more complicated.
>
> Is there any reason that the garbage resolution can't be zeroed out to
> indicate confidence of the kernel in the precision of the information?  -sc

Well, I only recently decided that "garbage" is the right way to think
of the extra precision.  Some care would be required to not increase
incoherency when discarding the garbage.
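
For reference, clock_getres() makes the claimed resolutions easy to
inspect (a sketch; CLOCK_VIRTUAL and CLOCK_PROF are FreeBSD-specific):

%%%
#include <stdio.h>
#include <time.h>

static void show_res(const char *name, clockid_t id) {
   struct timespec res;

   if(clock_getres(id, &res) == 0)
     printf("%-16s %ld.%09ld s\n", name, (long)res.tv_sec, res.tv_nsec);
}

int main(void) {
   show_res("CLOCK_REALTIME", CLOCK_REALTIME);
   show_res("CLOCK_VIRTUAL", CLOCK_VIRTUAL);
   show_res("CLOCK_PROF", CLOCK_PROF);
   return 0;
}
%%%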

Bruce

