select/poll/usleep precision on FreeBSD vs Linux vs OSX
Bruce Evans
brde at optusnet.com.au
Thu Mar 1 03:14:18 UTC 2012
On Thu, 1 Mar 2012, Luigi Rizzo wrote:
> On Thu, Mar 01, 2012 at 11:33:46AM +1100, Bruce Evans wrote:
>> On Wed, 29 Feb 2012, Luigi Rizzo wrote:
>>>         |             Actual timeout             |
>>>         |         select         | poll  | usleep|
>>> timeout | FBSD  | Linux | OSX    | FBSD  | FBSD  |
>>> usec    | 9.0   | Vbox  | 10.6   | 9.0   | 9.0   |
>>> --------+-------+-------+--------+-------+-------+
>>>       1 |  2000 |    99 |      6 |     0 |  2000 |
>>>      10 |  2000 |   109 |     15 |     0 |  2000 |
>>>      50 |  2000 |   149 |     66 |     0 |  2000 |
>>>     100 |  2000 |   196 |    133 |     0 |  2000 |
>>>     500 |  2000 |   597 |    617 |     0 |  2000 |
>>>    1000 |  2000 |  1103 |   1136 |  2000 |  2000 |
>>>    1001 |  3000 |  1103 |   1136 |  2000 |  3000 | <---
>>>    1500 |  3000 |  1608 |   1631 |  2000 |  3000 | <---
>>>    2000 |  3000 |  2096 |   2127 |  3000 |  3000 |
>>>    2001 |  4000 |       |        |  3000 |  4000 | <---
>>>    3001 |  5000 |       |        |  4000 |  5000 | <---
>>>
>>> Note how the rounding (poll has the timeout in milliseconds) affects
>>
>> You must have synced with timer interrupts to get the above. Timeouts
>
> yes i have -- the test code does almost nothing after returning from
> a select, on a system that does some amount of work times could be
> up to 1000us shorter. Still a huge error on short timeouts.
I get the sync but not the rounded timeouts, on my ~5.2 kernel with
HZ = 100. The times are typically 19900-19993 for rounding up 1 us
to 2 ticks.
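For anyone who wants to repeat the measurement, here is a minimal sketch of one way to time a single select() timeout. It is not the test program used for the table; measure_select_us() is a hypothetical name, and CLOCK_MONOTONIC is assumed to be available.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <sys/select.h>
#include <time.h>

/*
 * Hypothetical helper: sleep in select() with no descriptors and return
 * the elapsed time in microseconds, or -1 if select() failed.
 */
static int64_t
measure_select_us(long timeout_us)
{
	struct timespec t0, t1;
	struct timeval tv;

	assert(timeout_us >= 0);
	tv.tv_sec = timeout_us / 1000000;
	tv.tv_usec = timeout_us % 1000000;
	clock_gettime(CLOCK_MONOTONIC, &t0);
	if (select(0, NULL, NULL, NULL, &tv) == -1)
		return (-1);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	/* Difference of the two timestamps, truncated to whole us. */
	return ((int64_t)(t1.tv_sec - t0.tv_sec) * 1000000 +
	    (t1.tv_nsec - t0.tv_nsec) / 1000);
}
```

Calling this in a tight loop should sync the caller with timer interrupts on a tick-based kernel, since each call returns just after a tick. Averaging the results over about a second gives figures comparable to the table.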
> I should also comment that these are average values on an otherwise
> idle system -- i will try to post a histogram of the actual values,
> it might well be that osx and linux have quantized values very
> different from the average (though this would violate the specs,
> so i suspect instead that they have some cheap one-shot timers).
>
> For FreeBSD I have also rounded the bsd values (actual averages are -1/+3us
> over 1sec experiments).
Oh. The jitter is of minor interest, and rounding to usec should show
an average of slightly less than the timeout rounded up to ticks (on
an unloaded system).
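The rounding can be modelled with a few lines of arithmetic. This is only a sketch, assuming the request is rounded up to whole ticks and one more tick is added. ticks_for_timeout() and expected_timeout_us() are hypothetical names, not the kernel's own conversion routine.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of a tick-based timeout: round the request up to
 * whole ticks, then add one tick so the sleep is never shorter than
 * what was asked for.
 */
static int64_t
ticks_for_timeout(int64_t usec, int hz)
{
	int64_t tick_us;

	assert(usec > 0 && hz > 0 && hz <= 1000000);
	tick_us = 1000000 / hz;		/* length of one tick in us */
	return ((usec + tick_us - 1) / tick_us + 1);
}

/* Upper bound on the observed delay, in microseconds. */
static int64_t
expected_timeout_us(int64_t usec, int hz)
{
	return (ticks_for_timeout(usec, hz) * (1000000 / hz));
}
```

With hz = 1000 this gives 2000 for requests of 1 to 1000 us and 3000 for 1001 to 2000 us, matching the FreeBSD select and usleep columns. With hz = 100 a 1 us request maps to 2 ticks, or 20000 us, and a synced caller should see slightly less than that.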
Bakul Shah confirmed that Linux now reprograms the timer. It has to,
for a tickless kernel. FreeBSD reprograms timers too. I think you
can set HZ large and only get timeout interrupts at that frequency if
there are active timeouts that need them. Timeout granularity is still
1/HZ.
Hmm, this may explain why you are getting exact n000's. Every time
you ask for a timeout, you get one n000 us later (on a near-idle machine
where nothing else is asking for many timeouts). Old kernels instead
give timeouts on perfectly periodic n000(+error) boundaries, so when
the syscall is made just after a boundary, the boundary for the timeout
is never a full n000 away. There may be a lot of jitter for both, but
if the reprogramming of the timer when you ask for a new timeout is
too smart, then the jitter will average out to 0, giving perfect n000's.
Try running multiple sources of new timeouts. I think a periodic
itimer should produce perfectly periodic ones with little overhead.
Then other timeouts should not change the periodicity or even
reprogram the timer.
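A periodic itimer for such an experiment could be set up roughly as follows. This is a sketch using setitimer(ITIMER_REAL) and SIGALRM; the helper names and the tick counter are made up for illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

static volatile sig_atomic_t nticks;	/* SIGALRMs delivered so far */

static void
on_alarm(int sig)
{
	(void)sig;
	nticks++;
}

/* Arm ITIMER_REAL to fire every period_us microseconds; 0 or -1. */
static int
start_periodic_itimer(long period_us)
{
	struct sigaction sa;
	struct itimerval itv;

	assert(period_us > 0);
	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGALRM, &sa, NULL) == -1)
		return (-1);
	itv.it_interval.tv_sec = period_us / 1000000;
	itv.it_interval.tv_usec = period_us % 1000000;
	itv.it_value = itv.it_interval;	/* first expiry one period from now */
	return (setitimer(ITIMER_REAL, &itv, NULL));
}

/*
 * Block until at least n ticks have arrived.  A signal that lands
 * between the check and pause() only delays us by one more period,
 * because the timer keeps firing.
 */
static void
wait_for_ticks(int n)
{
	while (nticks < n)
		pause();
}
```

Running the select test in a second process while this timer is armed would show whether the extra timeout source changes the measured values.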
Reprogramming on demand seems to give unwanted aperiodicity: you ask for
a delay of 1 and get 2000. Suppose you actually want 2000, and actually
get it relative to the request time. Then the timer must be interrupting
aperiodically, with an average period of 2000+(overhead time of say 2),
possibly with large jitter. So 500 of these take 500*2002 us = 1 second plus 1000 us,
plus any jitter (the jitter may be negative, but is most likely positive,
since when the process setting up the timeouts is preempted and nothing
else is setting them up, there may be a large additional delay).
I try to avoid this problem in my version of ping. I try to send a packet
on every 1 second boundary. Normal ping tries to send one 1 second after
the previous one, but it can't do this since it has overheads and gets
preempted. With HZ=100 and rounding up and adding 1, the drift is likely
to be 20 msec every second or 2%. This is quite a lot. My version tries
to schedule a timeout that expires exactly 1 second after the previous
packet was sent, not 1 second after the current time. It takes a simple
subtraction to determine the timeout to reach the next seconds boundary,
but determining the times to subtract seems to require an extra
gettimeofday() call. I should use a periodic itimer and depend on it
actually being periodic. The kernel must do similar things to keep
periodic itimers actually periodic after it reprograms timers. There
may be a lot of jitter on each reprogramming, but this can be compensated
for on average. OTOH, as for skewing clocks, the compensation shouldn't
go too fast in either direction. This could get complicated. I don't
know what -current actually does.
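The subtraction can be sketched like this. It is a simplified variant, not the code from my ping: it assumes times are kept as nanosecond counts from a monotonic clock and steps a fixed deadline forward instead of using the previous send time. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Step the deadline forward by whole periods until it lies in the
 * future, so that a late wakeup does not shift the later deadlines.
 */
static int64_t
next_deadline_ns(int64_t deadline_ns, int64_t now_ns, int64_t period_ns)
{
	assert(period_ns > 0);
	do {
		deadline_ns += period_ns;
	} while (deadline_ns <= now_ns);
	return (deadline_ns);
}

/* Timeout to request: time left until the deadline, never negative. */
static int64_t
delay_until_ns(int64_t deadline_ns, int64_t now_ns)
{
	return (deadline_ns > now_ns ? deadline_ns - now_ns : 0);
}
```

Each iteration would read the clock once, sleep for delay_until_ns(), send the packet and then call next_deadline_ns(). Overhead and preemption shorten the next sleep instead of accumulating as drift, at the cost of the extra clock read.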
Bruce