svn commit: r200510 - head/sys/kern

Tue Dec 15 09:39:14 PST 2009

On Tuesday 15 December 2009 16:46:07 Bruce Evans wrote:
> On Tue, 15 Dec 2009, Pieter de Goeje wrote:
> > On Monday 14 December 2009 15:46:35 Luigi Rizzo wrote:
> >> On Mon, Dec 14, 2009 at 02:18:42PM +0000, Robert Watson wrote:
> >>> On Mon, 14 Dec 2009, Luigi Rizzo wrote:
> >>
> >> ...
> >>
> >>>> Together with a smaller patch committed in September, this fixes a
> >>>> bug that affects 8.0 with apps that rely on callouts to fire exactly
> >>>> in the number of ticks specified (qemu among them).
> >>>> Right now, callouts in 8.0 fire one tick late.
> >>>>
> >>>> This was discussed in September with JeffR and jhb.
> >>>
> >>> Once this has burned in, is it something you would consider appropriate
> >>> to be an errata note candidate?
> >>
> >> i have no objection, but at the time someone commented that
> >> callouts do not _guarantee_ when they will run so strictly speaking
> >> this is not a bug (i do think that being always a tick late _is_ a bug).
> >
> > As a person running a couple of game servers which rely on nanosleep to
> > get a fixed number of frames per second, I'd say that it is a bug.
> 
> Being a tick late is certainly a bug.  Relying on nanosleep to get a
> fixed number of frames per second is another bug.  If you want a
> periodic timer, setitimer(2) with a nonzero it_interval (so that the
> timer repeats automatically) must be used.
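A minimal sketch of the periodic-timer approach described above, assuming a POSIX system. The helper name run_periodic() and its parameters are hypothetical and only illustrate the idea: it_value arms the first expiry, it_interval makes the timer reload itself, and SIGALRM is kept blocked except inside sigsuspend() so that no expiry is lost between the loop test and the wait.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>

static volatile sig_atomic_t alarms_seen;

static void on_alarm(int sig)
{
    (void)sig;
    alarms_seen++;
}

/*
 * Hypothetical helper: wait for `frames` expirations of a periodic
 * ITIMER_REAL timer with a period of `period_us` microseconds.
 * Returns the number of expirations handled, or -1 on error.
 */
int run_periodic(int frames, long period_us)
{
    struct sigaction sa;
    struct itimerval it, off;
    sigset_t block, old, wait_mask;
    int seen;

    assert(frames > 0 && period_us > 0 && period_us < 1000000L);

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) == -1)
        return -1;

    /* Block SIGALRM so it is delivered only inside sigsuspend(). */
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &block, &old) == -1)
        return -1;
    wait_mask = old;
    sigdelset(&wait_mask, SIGALRM);

    alarms_seen = 0;
    memset(&it, 0, sizeof(it));
    it.it_value.tv_usec = period_us;     /* first expiry */
    it.it_interval.tv_usec = period_us;  /* automatic reload */
    if (setitimer(ITIMER_REAL, &it, NULL) == -1) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }

    while (alarms_seen < frames) {
        sigsuspend(&wait_mask);  /* returns after a handler has run */
        /* per-frame work would go here */
    }
    seen = alarms_seen;

    /* Disarm the timer and restore the caller's signal mask. */
    memset(&off, 0, sizeof(off));
    setitimer(ITIMER_REAL, &off, NULL);
    sigprocmask(SIG_SETMASK, &old, NULL);
    return seen;
}
```

Calling run_periodic(50, 20000L) would aim for 50 wakeups at a nominal 20 ms period; how closely the kernel honours that period still depends on hz.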
> 
> > This might also
> > affect video players which want to show their frames on time.  The
> > default HZ of 1000 mitigates the problem somewhat, but on, for example,
> > a laptop running at HZ=100 the error is noticeable.
> > To illustrate my point, calling usleep(1) 100 times in a loop results in
> > a running time of 3 seconds with kern.hz=100 (measured on 8.x from Dec
> > 9th), which is 3 times as long as one might reasonably expect. This
> > suggests that the callout fires 2 ticks late ...
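The measurement above can be reproduced with something like the following sketch. It is not the program used for the numbers quoted here; the helper name time_sleep_loop() is hypothetical, and a 1-microsecond nanosleep() request stands in for usleep(1).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/*
 * Hypothetical helper: issue `count` back-to-back nanosleep() requests
 * of `nsec` nanoseconds each and return the elapsed wall time in
 * seconds, measured with CLOCK_MONOTONIC.
 */
double time_sleep_loop(int count, long nsec)
{
    struct timespec req, t0, t1;
    int i;

    assert(count > 0 && nsec > 0 && nsec < 1000000000L);

    req.tv_sec = 0;
    req.tv_nsec = nsec;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < count; i++)
        nanosleep(&req, NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return (double)(t1.tv_sec - t0.tv_sec) +
        (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

time_sleep_loop(100, 1000L) corresponds to the 100 calls of usleep(1); the result depends on the kernel version and on kern.hz, so no expected value is given.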
> 
> Only 1 tick late.  I get a running time of 2 seconds with hz = 100 under
> FreeBSD-~5.2, presumably because 5.2 didn't have the 1-tick-late bug.
> 
> The time is expected to be 2 seconds instead of 1 because nanosleep()
> adds an extra 1 tick though it would work right (but slower) with other
> small changes (also pessimizations) if it didn't.  To sleep for 1
> microsecond, it is always necessary to wait until the next tick for
> obvious reasons.  The next tick might occur in less than a microsecond
> (when the timeout happens to be set up just before the tick), so
> nanosleep() can't just return when the tick occurs.  It should check
> if the timeout has expired (in real time, not ticks) and wait for
> another tick if not.  In fact, it already does this in order to be
> reasonably accurate for long timeouts.  However, to be simple and
> efficient, it just waits for an extra tick initially, using generic
> code that adds 1 to the tick count.  Other uses of the generic code
> don't check that the timeout has expired so they need this extra 1
> for correctness, but nanosleep() only needs it for efficiency.  This
> optimization for efficiency is more historical than intentional.
> nanosleep() also uses a fuzzy check (getnanouptime() instead of
> nanouptime()) for the expiry, which can give an error of about 1 tick.
> With the fuzzy check, the extra 1 might still be needed for correctness.
> 
> Thus when hz = 100, 100 nanosleeps for 1 microsecond (or even 1
> nanosecond) take between about 1 and 2 seconds, with an average of 1.5
> seconds for random calls and an average of 2 seconds for synchronized
> calls.  Sequential calls with no other system activity give synchronized
> calls.  The time of 2/hz for synchronized calls can be depended on if
> there is no other system activity, but it is better to use setitimer()
> as above -- then cases with other system activity have a better chance
> of working, and you can also get a time of 1/hz.
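The arithmetic in the paragraphs above can be checked with a toy model. This is not kernel code: the function names are hypothetical, time is counted in whole microseconds, and a sleeper with a timeout of timo_ticks is assumed to wake at the timo_ticks-th tick boundary strictly after it starts. A timeout of 2 ticks models the 1-microsecond request rounded up to 1 tick plus the generic extra tick; 3 ticks models the same with the one-tick-late callout bug.

```c
#include <assert.h>

/*
 * Toy model: ticks fire at multiples of 1000000/hz microseconds.
 * Returns the time at which a sleeper started at `start_us` wakes,
 * i.e. the `timo_ticks`-th tick boundary strictly after `start_us`.
 */
long wake_time_us(long start_us, int timo_ticks, int hz)
{
    long tick_us, next_tick;

    assert(hz > 0 && timo_ticks > 0 && start_us >= 0);

    tick_us = 1000000L / hz;
    next_tick = (start_us / tick_us + 1) * tick_us;
    return next_tick + (long)(timo_ticks - 1) * tick_us;
}

/*
 * Total time for `count` back-to-back sleeps, each one starting at the
 * instant the previous one wakes (the "synchronized" case).
 */
long total_sleep_us(int count, int timo_ticks, int hz, long first_start_us)
{
    long now = first_start_us;
    int i;

    for (i = 0; i < count; i++)
        now = wake_time_us(now, timo_ticks, hz);
    return now - first_start_us;
}
```

Under these assumptions, total_sleep_us(100, 2, 100, 0) gives 2000000 (2 seconds), a timeout of 3 ticks gives 3000000 (the 3 seconds measured on 8.x), and a single sleep started mid-tick, such as wake_time_us(5000, 2, 100), wakes 15000 microseconds later, inside the 1-to-2-tick range.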
> 
> nanosleep() is correct but very sloppy for long sleeps.  E.g., with
> hz = 1000 on i386(i8254), the average absolute error for a set of perfectly
> calibrated i8254 ticks is about 0.02%, so for sleeps of 1 year the error
> will be at best 1.82 hours on average.  If the hz clock runs faster than
> real time, then nanosleep() wakes up early and does shorter sleeps to
> reach the correct real time, but if the hz clock runs slow then nanosleep()
> normally wakes up hours per year late.  This is easy to fix at a cost of
> efficiency by intentionally underestimating the timeout in ticks, which
> goes well with not adding 1.
> 
> Bruce
> 
Thank you for your very thorough explanation. 
To recap, nanosleep() will always be one tick too late in the synchronous case
unless a check of sleep time left is implemented in addition to not blindly
adding 1 to the number of ticks to wait. To get the minimum latency of 1/hz
with the current implementation, nanosleep() should be called right before the
next tick fires.

- Pieter