Re-engineer the wheel: a rejuvenation of BSD callout(9) and timer facilities - report weeks 5-8

Davide Italiano davide at freebsd.org
Mon Jul 16 17:36:44 UTC 2012


Over these weeks I've completed a fair number of tasks:

Week 5:
- Slightly modify the callout(9) public KPI, as well as the
sleepqueue(9) one, to avoid code duplication and breakage.
- Initial implementation of event aggregation. Augment the callout
structure so that consumers may specify, in addition to the actual
time at which the callout should fire, a tolerance interval. Rather
than looking only for the next callout event in callout_tick(),
determine a range [t-delta;t'+delta'], derived from the tolerance
parameters specified by clients, that is suitable for a given number
of events, and schedule an interrupt in the middle of that range.
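
The aggregation idea above can be sketched in plain userland C. This
is only an illustration, not the code in my branch: struct ev and
aggregate() are hypothetical names, times are plain integers, and I
assume the range is the intersection of the tolerance windows, so that
every covered event fires within its own tolerance.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event: a target time plus the tolerance the consumer accepts. */
struct ev {
	int64_t when;	/* requested expiry time, in arbitrary ticks */
	int64_t tol;	/* acceptable deviation on either side */
};

/*
 * Walk events sorted by 'when' and shrink the window [lo, hi] to the
 * intersection of their tolerance windows, stopping at the first event
 * whose window does not overlap.  Returns the number of events covered
 * and stores the midpoint of the common window in *fire_at.
 */
static size_t
aggregate(const struct ev *evs, size_t n, int64_t *fire_at)
{
	int64_t lo, hi, elo, ehi;
	size_t i;

	assert(n > 0);
	lo = evs[0].when - evs[0].tol;
	hi = evs[0].when + evs[0].tol;
	for (i = 1; i < n; i++) {
		elo = evs[i].when - evs[i].tol;
		ehi = evs[i].when + evs[i].tol;
		if (elo > hi || ehi < lo)
			break;	/* no overlap: leave it for a later interrupt */
		if (elo > lo)
			lo = elo;
		if (ehi < hi)
			hi = ehi;
	}
	*fire_at = lo + (hi - lo) / 2;
	return (i);
}
```

With events (100, tol 10), (105, tol 10) and (130, tol 5), the first
two share the window [95, 110], so a single interrupt in the middle of
it serves both, while the third event is left for a later interrupt.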

Week 6:
- Experiment with a new approach for low-precision events: try to
align them to some time boundaries on insert. This approach can make
system load more bursty, but it is very cheap to implement and may be
quite effective. Moreover, it can easily coexist with the previously
implemented "real-time aggregation".
- General polishing of the code as suggested by mav@ and bde@
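
As a rough sketch of the alignment-on-insert idea from week 6:
align_up() and insert_time() are hypothetical helpers, 'period' is an
assumed alignment granularity, and rounding up (rather than to the
nearest boundary) is my choice here so that an event never fires
early. The real code works on kernel time types, not plain integers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round 'when' up to the next multiple of 'period'.  The event is
 * delayed by less than one period and never fires early.
 * Assumes when >= 0 and period > 0.
 */
static int64_t
align_up(int64_t when, int64_t period)
{
	int64_t rem;

	assert(when >= 0 && period > 0);
	rem = when % period;
	return (rem == 0 ? when : when + (period - rem));
}

/*
 * Align only low-precision events, i.e. those whose tolerance is at
 * least one full period; precise events keep their requested time.
 */
static int64_t
insert_time(int64_t when, int64_t tol, int64_t period)
{
	return (tol >= period ? align_up(when, period) : when);
}
```

Events aligned this way land on the same boundaries and can be served
by one interrupt, which is where the burstiness mentioned above comes
from.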

Week 7:
- Add a new CALLOUT_PROFILING option so that the SYSCTLs on the wheel
may be selectively enabled/disabled. Disabling this sort of
rudimentary profiling may have a good effect on CPU caches, because
the same variable is no longer accessed by different CPUs.
- Fix a bug in the 'steps' variable logic in softclock(). It shouldn't
be zeroed every time we extract a new event from cc_expireq for
processing.
- Take aggregation into account when comparing event times in
callout_process() and when we submit events to eventtimers(4).

Week 8:
- Enable execution of callouts from hardware interrupt context rather
than software interrupt context for kern_nanosleep() and seltdwait().
This change improves precision for the
select()/poll()/usleep()/nanosleep() services.
- Fix an issue related to the old periodic timers. The code in
kern_clocksource.c uses interrupts to keep track of time, and this
time may not match binuptime(). To address this inconsistency, switch
periodic timers to binuptime(). While here, modify callout_process()
so that it takes the present time as an argument, which avoids calling
binuptime() twice, even though this is not strictly needed.
- Make the interval timings for EVFILT_TIMER more accurate.
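
The callout_process() change from week 8 follows a simple pattern:
read the clock once and pass the value down. Below is a minimal
userland sketch of that pattern only; read_clock(), process_expired()
and handle_timer() are hypothetical stand-ins, not the kernel
functions, and time is a plain counter rather than a struct bintime.

```c
#include <assert.h>
#include <stdint.h>

static int clock_reads;			/* how many times the clock was read */
static int64_t fake_time = 1000;	/* stand-in time source */

/* Stand-in for binuptime(): every read is counted and advances time. */
static int64_t
read_clock(void)
{
	clock_reads++;
	return (fake_time++);
}

/* Count expired deadlines using the caller's timestamp, not a new read. */
static int
process_expired(const int64_t *deadlines, int n, int64_t now)
{
	int i, fired;

	assert(n >= 0);
	fired = 0;
	for (i = 0; i < n; i++)
		if (deadlines[i] <= now)
			fired++;
	return (fired);
}

/* Timer handler: read the clock once and hand the value down. */
static int
handle_timer(const int64_t *deadlines, int n)
{
	int64_t now;

	now = read_clock();
	return (process_expired(deadlines, n, now));
}
```

Besides saving a clock read, both layers see the same notion of "now",
so they cannot disagree on whether an event has expired.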

The next step (for the next week or two) will be to implement
interrupt compensation in order to achieve even better precision than
now. This work is delicate and requires proper design, so I'm now
discussing with my mentor a proper way to implement it.
The code has been proposed on the freebsd-arch mailing list, and I got
some useful comments.
Also, Florian Smeets (flo@) offered to benchmark my changes vs HEAD so
we'll soon have some results.

Davide

