select/poll/usleep precision on FreeBSD vs Linux vs OSX
Bruce Evans
brde at optusnet.com.au
Thu Mar 1 05:42:48 UTC 2012
On Thu, 1 Mar 2012, Bruce Evans wrote:
> On Thu, 1 Mar 2012, Bruce Evans wrote:
>
>> ...
>> Bakul Shah confirmed that Linux now reprograms the timer. It has to,
>> for a tickless kernel. FreeBSD reprograms timers too. I think you
>> can set HZ large and only get timeout interrupts at that frequency if
>> there are active timeouts that need them. Timeout granularity is still
>> 1/HZ.
>
> I tried this in -current and in a 2008 -current with hz=10000. It worked
> mediocrely:
> - the 2008 version gave lapic cpuN: timer interrupts on all CPUs at
> frequency of almost exactly 10 kHz. This is the behaviour before
> FreeBSD reprogrammed timers (except the frequency is often off by
> as much as 10% due to calibration bugs). There were many anomalies
> in the results from the test program (like select() adding 199 usec
> and usleep() adding 999 usec).
> - [... no surprises in -current]
I tried this in -current with hz=100000. This gives (some not very
surprising) behaviour:
- systat claims ~100% idle, but the ~100k interrupts on 1 CPU actually
reduce performance by 33% (two CPUs take 30 seconds user time to
do what can be done in 20 seconds user time with hz=100). This is
a normal problem with fast interrupt handlers. They need a faster
interrupt handler to account for them properly.
- ./prog 1 select works reasonably. It reports timeouts of 29-30 us.
I expected 19-20.
- ./prog 1 poll is broken as we know. It asks for timeouts of 0 and
takes 3 us.
- ./prog 1 usleep shows brokenness. It reports timeouts of 999 us.
I think this is due to getnanouptime()'s brokenness.
$(sysctl kern.timecounter.tick) is 100. This reduces getnanouptime()'s
accuracy back to 1 msec, which explains the 999 us. But why doesn't
select() have the same problem? select() uses getmicrouptime(), but
it has the same brokenness. The sysctl is r/o, so I couldn't use
it easily. I have changed tc_tick using ddb before, but don't want
to risk reducing it by a factor of 100. The timecounter update
algorithm depends on the timehands not being recycled too fast, and
probably couldn't cope with recycling 100 times faster.
- ./prog 1000 select and ./prog 1000 poll take 20 us extra. I expected
9-10 extra.
- ./prog 1000 usleep takes 619-693 us extra. Not the full extra 100
ticks from getnanouptime() fuzziness now.
- ./prog 500000 usleep takes 500026-500885 us. Even higher variance
which agrees with the fuzziness better. select and poll with this
timeout still have accuracy and low variance (21-26 us extra).
The fuzzy versions are actually useful for optimization after all:
- for long timeouts, use the fuzzy versions and accept their inaccuracies.
Sleep longer by the amount of fuzziness so that sleeps are never too
short.
- for short timeouts, it seems necessary for the initial timestamp to
be accurate. When checking if the timeout has expired, first try a
fuzzy check. This is sufficient if the current fuzzy time is far from
the expiry time.
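A rough sketch of the two rules above, as plain arithmetic on microsecond
values. This is not existing kernel code. It assumes the fuzzy clock only
ever lags the precise one, by at most fuzz_us (1 msec in the hz=100000
case above), and the names fuzzy_check() and long_deadline_us() are
invented for illustration.

```c
#include <stdint.h>

enum fuzzy_verdict { NOT_EXPIRED, EXPIRED, NEED_PRECISE };

/*
 * Decide from a fuzzy timestamp alone whether a deadline has passed.
 * Assumption: the true time lies in [fuzzy_now_us, fuzzy_now_us + fuzz_us].
 */
static enum fuzzy_verdict
fuzzy_check(int64_t fuzzy_now_us, int64_t deadline_us, int64_t fuzz_us)
{
	if (fuzzy_now_us >= deadline_us)
		return (EXPIRED);	/* true time is at least this late */
	if (fuzzy_now_us + fuzz_us < deadline_us)
		return (NOT_EXPIRED);	/* even the worst case is too early */
	return (NEED_PRECISE);		/* too close to call; read the real clock */
}

/*
 * Long timeouts: start from a fuzzy timestamp and pad the deadline by
 * the fuzz, so that a sleep checked against the fuzzy clock is never
 * shorter than timeout_us in true time.
 */
static int64_t
long_deadline_us(int64_t fuzzy_start_us, int64_t timeout_us, int64_t fuzz_us)
{
	return (fuzzy_start_us + timeout_us + fuzz_us);
}
```

In the kernel the fuzzy reading would presumably come from
getmicrouptime() and the precise one from microuptime(). The precise clock
is then read only when fuzzy_check() returns NEED_PRECISE, which for a
short timeout is most of the time and for a long one is almost never.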
Bruce