svn commit: r200510 - head/sys/kern

Tue Dec 15 07:46:12 PST 2009

On Tue, 15 Dec 2009, Pieter de Goeje wrote:

> On Monday 14 December 2009 15:46:35 Luigi Rizzo wrote:
>> On Mon, Dec 14, 2009 at 02:18:42PM +0000, Robert Watson wrote:
>>> On Mon, 14 Dec 2009, Luigi Rizzo wrote:
>>
>> ...
>>
>>>> Together with a smaller patch committed in september, this fixes a
>>>> bug that affects 8.0 with apps that rely on callouts to fire exactly
>>>> in the number of ticks specified (qemu among them).
>>>> Right now, callouts in 8.0 fire one tick late.
>>>>
>>>> This was discussed in september with JeffR and jhb
>>>
>>> Once this has burned in, is it something you would consider appropriate
>>> to be an errata note candidate?
>>
>> i have no objection, but at the time someone commented that
>> callouts do not _guarantee_ when they will run so strictly speaking
>> this is not a bug (i do think that being always a tick late _is_ a bug).
>
> As a person running a couple of game servers which rely on nanosleep to get a
> fixed number of frames per second, I'd say that it is a bug.

Being a tick late is certainly a bug.  Relying on nanosleep to get a
fixed number of frames per second is another bug.  If you want a
periodic timer, setitimer(2) with a nonzero it_interval (so that the
timer reloads automatically after each expiry) must be used.
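
A minimal sketch of such a periodic timer, assuming ITIMER_REAL and
SIGALRM.  In struct itimerval, it_value is the time to the first expiry
and it_interval is the reload value that makes the timer repeat.  The
names run_periodic(), on_alarm() and alarm_count are hypothetical, chosen
for illustration only.

```c
#define _XOPEN_SOURCE 700	/* ask for the POSIX/XSI declarations */
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* Number of SIGALRM deliveries seen so far (hypothetical name). */
static volatile sig_atomic_t alarm_count;

static void
on_alarm(int sig)
{
	(void)sig;
	alarm_count++;
}

/*
 * Arm ITIMER_REAL to expire every period_usec microseconds and block
 * until nframes expirations have been delivered.  Returns the number
 * of expirations seen, or -1 on error.  Illustrative helper only.
 */
static int
run_periodic(long period_usec, int nframes)
{
	struct sigaction sa, old_sa;
	struct itimerval itv;
	sigset_t block, old_mask, wait_mask;
	int result = -1;

	if (period_usec <= 0)
		return (-1);	/* a zero it_value would disarm the timer */

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGALRM, &sa, &old_sa) == -1)
		return (-1);

	/* Keep SIGALRM blocked except while waiting in sigsuspend(). */
	sigemptyset(&block);
	sigaddset(&block, SIGALRM);
	sigprocmask(SIG_BLOCK, &block, &old_mask);
	wait_mask = old_mask;
	sigdelset(&wait_mask, SIGALRM);

	alarm_count = 0;
	itv.it_value.tv_sec = period_usec / 1000000;
	itv.it_value.tv_usec = period_usec % 1000000;
	itv.it_interval = itv.it_value;	/* nonzero: reload after each expiry */
	if (setitimer(ITIMER_REAL, &itv, NULL) == 0) {
		while (alarm_count < nframes)
			sigsuspend(&wait_mask);	/* per-frame work would follow */
		result = (int)alarm_count;
	}

	/* Disarm, discard any expiry still pending, then restore state. */
	memset(&itv, 0, sizeof(itv));
	setitimer(ITIMER_REAL, &itv, NULL);
	sa.sa_handler = SIG_IGN;
	sigaction(SIGALRM, &sa, NULL);
	sigaction(SIGALRM, &old_sa, NULL);
	sigprocmask(SIG_SETMASK, &old_mask, NULL);
	return (result);
}
```

Because the timer reloads itself, time spent doing the per-frame work
does not push later expiries back the way a chain of relative sleeps
does.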

> This might also
> affect video players which want to show their frames on time.  The default HZ
> of 1000 mitigates the problem somewhat, but on for example a laptop running at
> HZ=100 the error is noticeable.
> To illustrate my point, calling usleep(1) 100 times in a loop results in a
> running time of 3 seconds with kern.hz=100 (measured on 8.x from Dec 9th),
> which is 3 times as long as one might reasonably expect. This suggests that
> the callout fires 2 ticks late ...

Only 1 tick late.  I get a running time of 2 seconds with hz = 100 under
FreeBSD-~5.2, presumably because 5.2 didn't have the 1-tick-late bug.
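
The measurement can be reproduced with something like the following
sketch.  The helper name time_short_sleeps() is hypothetical, and
nanosleep() stands in for the usleep(1) call of the original test.

```c
#define _XOPEN_SOURCE 700	/* ask for the POSIX declarations */
#include <time.h>

/*
 * Return the wall-clock time, in seconds, taken by n back-to-back
 * nanosleep() calls of nsec nanoseconds each (nsec must be below 1e9).
 * Illustrative helper only.
 */
static double
time_short_sleeps(int n, long nsec)
{
	struct timespec req, start, end;
	int i;

	req.tv_sec = 0;
	req.tv_nsec = nsec;
	clock_gettime(CLOCK_MONOTONIC, &start);
	for (i = 0; i < n; i++)
		nanosleep(&req, NULL);
	clock_gettime(CLOCK_MONOTONIC, &end);
	return ((double)(end.tv_sec - start.tv_sec) +
	    (double)(end.tv_nsec - start.tv_nsec) / 1e9);
}
```

Calling time_short_sleeps(100, 1000) on a machine booted with kern.hz=100
corresponds to the test described above; the result depends on the
kernel version and on other system activity.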

The time is expected to be 2 seconds instead of 1 because nanosleep()
adds an extra tick.  It would still work right (but slower) without the
extra tick, given other small changes (also pessimizations).  To sleep for 1
microsecond, it is always necessary to wait until the next tick for
obvious reasons.  The next tick might occur in less than a microsecond
(when the timeout happens to be set up just before the tick), so
nanosleep() can't just return when the tick occurs.  It should check
if the timeout has expired (in real time, not ticks) and wait for
another tick if not.  In fact, it already does this in order to be
reasonably accurate for long timeouts.  However, to be simple and
efficient, it just waits for an extra tick initially, using generic
code that adds 1 to the tick count.  Other uses of the generic code
don't check that the timeout has expired so they need this extra 1
for correctness, but nanosleep() only needs it for efficiency.  This
optimization for efficiency is more historical than intentional.
nanosleep() also uses a fuzzy check (getnanouptime() instead of
nanouptime()) for the expiry, which can give an error of about 1 tick.
With the fuzzy check, the extra 1 might still be needed for correctness.
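
As a simplified model of that generic conversion (not the kernel's
actual code, which also has to deal with overflow and very long
timeouts), the tick count can be thought of as the timeout rounded up to
whole ticks, plus 1.  The function name timeout_to_ticks() is
hypothetical.

```c
#include <stdint.h>

/*
 * Simplified model: convert a timeout in microseconds to a tick count
 * by rounding up to whole ticks and then adding the extra 1 tick.
 * Assumes hz divides 1000000 evenly.  Illustrative only.
 */
static int64_t
timeout_to_ticks(int64_t usec, int hz)
{
	int64_t tick_usec = 1000000 / hz;
	int64_t ticks = (usec + tick_usec - 1) / tick_usec;	/* round up */

	return (ticks + 1);	/* the extra tick discussed above */
}
```

With hz = 100, a 1 microsecond timeout becomes 2 ticks in this model,
which is where the 2/hz figure for short sleeps comes from.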

Thus when hz = 100, 100 nanosleeps for 1 microsecond (or even 1
nanosecond) take between about 1 and 2 seconds, with an average of 1.5
seconds for random calls and an average of 2 seconds for synchronized
calls.  Sequential calls with no other system activity give synchronized
calls.  The time of 2/hz for synchronized calls can be depended on if
there is no other system activity, but it is better to use setitimer()
as above -- then cases with other system activity have a better chance
of working, and you can also get a time of 1/hz.
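
The 1 to 2 second range can be checked with a small model, assuming a
sleep of 2 ticks that ends on a tick boundary and a call made
phase_usec microseconds after the previous tick.  Both function names
are hypothetical.

```c
/*
 * Delay until wakeup for a sleep of `ticks` ticks that starts
 * phase_usec microseconds after the previous tick (model only).
 */
static long
wake_delay_usec(long phase_usec, long ticks, long tick_usec)
{
	return (ticks * tick_usec - phase_usec);
}

/* Mean delay over every whole-microsecond phase within one tick. */
static double
mean_delay_usec(long ticks, long tick_usec)
{
	double sum = 0.0;
	long phase;

	for (phase = 0; phase < tick_usec; phase++)
		sum += (double)wake_delay_usec(phase, ticks, tick_usec);
	return (sum / (double)tick_usec);
}
```

With hz = 100 (tick_usec = 10000), a synchronized call has phase 0 and
waits the full 20000 microseconds, so 100 of them take 2 seconds.
Averaged over random phases the delay is about 15000 microseconds, or
1.5 seconds per 100 calls.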

nanosleep() is correct but very sloppy for long sleeps.  E.g., with
hz = 1000 on i386(i8254), the average absolute error for a set of perfectly
calibrated i8254 ticks is about 0.02%, so for sleeps of 1 year the error
will be at best 1.82 hours on average.  If the hz clock runs faster than
real time, then nanosleep() wakes up early and does shorter sleeps to
reach the correct real time, but if the hz clock runs slow then nanosleep()
normally wakes up hours per year late.  This is easy to fix at a cost of
efficiency by intentionally underestimating the timeout in ticks, which
goes well with not adding 1.

Bruce