cvs commit: src/usr.bin/vmstat vmstat.c src/usr.bin/w w.c

Thu Oct 20 09:03:15 PDT 2005

On Thu, 20 Oct 2005, Poul-Henning Kamp wrote:

> In message <20051020211131.A874 at delplex.bde.org>, Bruce Evans writes:
>
>> POSIX's specification of CLOCK_MONOTONIC seems to be missing leap seconds
>> problems.  It seems to be required to actually work; thus it should give
>> the difference in real time, in seconds with nanoseconds resolution,
>> relative to its starting point, so it must include leap seconds.
>
> It doesn't make sense to talk about leap seconds on a timescale
> which is not UTC because leap seconds by definition only exist in UTC.

That's true for time_t's, but for differences between times there is
no UTC (or time_t's).  Leap seconds (if they happen) are just ordinary
seconds in differences.

> CLOCK_MONOTONIC is defined as a count of seconds (let's tacitly
> assume they mean SI seconds here) from an arbitrary origin.
>
> A better and unambiguous way to write that would have been:
>
> 	CLOCK_MONOTONIC = TAI + alpha
>
> It follows from this that CLOCK_MONOTONIC does not know what a
> leap-second is and doesn't notice them happening.

Same for difftime().  Even if time_t is specified to be broken, difftime()
doesn't have to be; it can handle leap seconds just as uneasily as
localtime().

> Because of our particular choice of alpha, CLOCK_MONOTONIC is also
> a very convenient measure of how many seconds the kernel has been
> running.

But slightly wrong.

>> Of course it can't reasonably be expected to have nanoseconds accuracy.
>> [...]
>
> It certainly can and should be expected to and it does.

Nah, it only has nanoseconds precision, since reading timecounters takes
several nanoseconds (several hundred for the ACPI timecounter) and you
can't control the timing of the start of the read.
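
A rough way to see the read cost is to read the clock twice in a row and
look at the gap.  The sketch below is only an illustration: the helper
names ts_diff_ns() and min_read_gap_ns() are made up for this example,
and the smallest gap seen is only a lower bound on the cost of one read
on whatever timecounter happens to be selected.

```c
#define _POSIX_C_SOURCE 200112L
#include <time.h>

/* Nanoseconds from *a to *b; both must come from the same clock. */
static long long
ts_diff_ns(const struct timespec *a, const struct timespec *b)
{
	return ((long long)(b->tv_sec - a->tv_sec) * 1000000000LL +
	    (b->tv_nsec - a->tv_nsec));
}

/*
 * Smallest gap seen between two back-to-back reads of CLOCK_MONOTONIC
 * (hypothetical helper).  Returns -1 if the clock cannot be read or if
 * samples is not positive.
 */
long long
min_read_gap_ns(int samples)
{
	struct timespec t0, t1;
	long long gap, best = -1;
	int i;

	for (i = 0; i < samples; i++) {
		if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0 ||
		    clock_gettime(CLOCK_MONOTONIC, &t1) != 0)
			return (-1);
		gap = ts_diff_ns(&t0, &t1);
		if (best < 0 || gap < best)
			best = gap;
	}
	return (best);
}
```

A timestamp cannot be more accurate than the time it takes to obtain it,
whatever the resolution of struct timespec.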

>> difftime() also seems to be required to actually work.  According to
>> draft C99 (n869.txt):
>>
>> %        [#2] The difftime function computes the  difference  between
>> %        two calendar times: time1 - time0.
>
> Again, this is another example of computer-geeks missing the finer
> points in timekeeping.
>
> The word "calendar" refers to things counting time in units of days.
>
> The above text therefore conveys no usable information about how
> leap-seconds should be accounted for, since leap seconds by definition
> are intra-day.

It's supposed to be an informal definition, since the details are large
and belong in a more specialized standard.

>> [clarification from other standard that the two time_t comes from time(2)]
>
>> time_t's cannot be naively subtracted in general in C, so the difference
>> here must be formal.  The difference is required to contain leap seconds
>> by POLA.
>
> Yeah, right: in your dreams...

In localtime.c.

> You can by definition not implement difftime correctly since the
> time_t timescale does not contain any indication of leapseconds.
>
> This means that there is no way to tell which side of an inserted
> leapsecond a time(2) timestamp comes from:
>
> 	UTC		time(2)
> 	23:59:57	N-3
> 	23:59:58	N-2
> 	23:59:59	N-1
> 	23:59:60	N
> 	00:00:00	N
> 	00:00:01	N+1
>
> Worst case, difftime() will be wrong by two seconds: taking the difference
> from one leapsecond to another and guessing wrong in both ends.

It only has to be wrong by 1 or 2 seconds for short intervals when a leap
second occurs.  Not adjusting makes difftime() wrong across all intervals
containing a leap second, with an error of the number of leap seconds in
the interval (+- 1 or 2 for leap seconds at endpoints).
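
As a sketch of the adjustment (not what localtime.c actually does), one
could count the leap seconds between the two endpoints and add them to
the plain difference.  The names leap_after[], leaps_between() and
elapsed_seconds() are invented for this example, and the table is
deliberately incomplete; it lists only two insertions.

```c
#include <stddef.h>
#include <time.h>

/*
 * Illustrative, incomplete table: the POSIX time of the first second
 * after each inserted leap second.
 */
static const time_t leap_after[] = {
	915148800,	/* 1999-01-01 00:00:00 UTC */
	1136073600,	/* 2006-01-01 00:00:00 UTC */
};

/* Number of table entries L with t0 < L <= t1 (hypothetical helper). */
int
leaps_between(time_t t0, time_t t1)
{
	size_t i;
	int n = 0;

	for (i = 0; i < sizeof(leap_after) / sizeof(leap_after[0]); i++)
		if (t0 < leap_after[i] && leap_after[i] <= t1)
			n++;
	return (n);
}

/*
 * Plain difference plus the leap seconds inside the interval.  An
 * endpoint that falls exactly on a leap second is still ambiguous,
 * so the result can be off by 1 per endpoint.
 */
double
elapsed_seconds(time_t t0, time_t t1)
{
	return (difftime(t1, t0) + leaps_between(t0, t1));
}
```

With this table, elapsed_seconds(1136073599, 1136073600) gives 2 where
plain subtraction gives 1, and the correction grows by 1 for every
further leap second that the interval contains.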

> The fact that mktime() and timegm() get it wrong the other way because
> of DWIM logic is merely icing on the cake.
>
>> Back to the utilities: according to the standards, it seems to be equally
>> correct to implement "double uptime()" as:
>>
>> 	/* Done in kernel; happens to give 0 in FreeBSD implementation: */
>> 	clock_gettime(CLOCK_MONOTONIC, &boottime);
>>
>> 	clock_gettime(CLOCK_MONOTONIC, &now);
>>
>> 	return (now.tv_sec - boottime.tv_sec +
>> 	    1e-9 * (now.tv_nsec - boottime.tv_nsec));
>
> On FreeBSD this delivers the correct answer.

Nope.  As I already explained, this drifts at the same rate as CLOCK_REALTIME
(possibly 0 on average if you correct the drift using micro-adjustments
(*adjtime*())).  Then stepping the clock using a macro-adjustment (*settime*())
may fix the drift in CLOCK_REALTIME but always leaves it in CLOCK_MONOTONIC.
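
One way to watch for this from userland is to sample the difference
between the two clocks.  This is only a sketch, with made-up helper names
ts_to_sec() and realtime_minus_monotonic(): if the clocks behave as
described above, a step should show up as a jump in the difference
between two samples, while micro-adjustments should leave it alone.

```c
#define _POSIX_C_SOURCE 200112L
#include <time.h>

/* Convert a timespec to seconds as a double (loses some precision). */
static double
ts_to_sec(const struct timespec *ts)
{
	return ((double)ts->tv_sec + 1e-9 * (double)ts->tv_nsec);
}

/*
 * Store CLOCK_REALTIME minus CLOCK_MONOTONIC, in seconds, in *offset
 * (hypothetical helper).  Returns 0 on success and -1 if either clock
 * cannot be read.
 */
int
realtime_minus_monotonic(double *offset)
{
	struct timespec rt, mono;

	if (clock_gettime(CLOCK_REALTIME, &rt) != 0 ||
	    clock_gettime(CLOCK_MONOTONIC, &mono) != 0)
		return (-1);
	*offset = ts_to_sec(&rt) - ts_to_sec(&mono);
	return (0);
}
```

Two samples taken before and after a step should differ by about the size
of the step, which is the part of the drift that stays in CLOCK_MONOTONIC.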

>> and as:
>>
>> 	/* Done in kernel; nonzero except if you booted in 1970: */
>> 	clock_gettime(CLOCK_REALTIME, &boottime);
>>
>> 	clock_gettime(CLOCK_REALTIME, &now);
>>
>> 	/* Restore leap seconds if necessary; lose nanoseconds resolution: */
>> 	return difftime(now.tv_sec, boottime.tv_sec);
>
> This suffers from the +/-2 second error from difftime(2) and will
> return the wrong result if the clock is stepped.

The errors from stepping are because stepping bogusly changes `boottime'
so as to make adjkerntz -i and old implementations of uptime() work.
Changing boottime is bogus because the boot time is whatever it is; it
doesn't change just because the clock drifts after booting and is fixed
later by stepping it.  Adjusting boottime makes both methods have the
same error from stepping.

Bruce