default HZ value in 5.2.1
Matthew Dillon
dillon at apollo.backplane.com
Fri Jul 9 09:55:15 PDT 2004
:Chris Stenton wrote:
:> Any reason why the default value for HZ is still set at 100?
:
:Sure. Timer interrupts and context switching aren't free, you know: the
:faster the HZ, the more time the system spends on overhead rather than doing
:useful work.
:
:> Would not 1000 be better for finer granularity?
:
:Certainly it would, but tradeoffs exist....
:
:--
:-Chuck

98% of the things hz is used for do not need a tick faster than 100hz.
If there are some things that do, such as, oh, IF interface polling,
you guys might want to consider implementing a fine-grained general
purpose one-shot/periodic timer abstraction for the kernel. DragonFly
uses just such an abstraction (kern/kern_systimer.c) for all sorts of
things, including separating the scheduler related clock interrupt from
the tick clock and distributing tick, stat, and scheduler interrupts
(and other clock related interrupts) independently to multiple cpus.

The only thing we do to make things more efficient is to synchronize
the major periodic functions (stat and sched) so they tend to dispatch
at the same time when their harmonics match. Other than that one
optimization, the hard clock interrupts in the system (tick, stat, sched,
and others) are now independent of each other.
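
To illustrate the harmonic point: one way to get periodic timers to coincide is to phase-lock each one to a shared origin, so that timers whose periods share a common multiple land on the same instant. This is only a sketch of that idea, not necessarily what kern/kern_systimer.c does; the helper names and the microsecond periods are made up for the example.

```c
#include <stdint.h>

/*
 * Hypothetical helper: next expiry of a periodic timer that is
 * phase-locked to a shared origin (time 0), strictly after now_us.
 */
static uint64_t
sketch_align_next(uint64_t now_us, uint64_t period_us)
{
	return (now_us / period_us + 1) * period_us;
}

/*
 * Count how often two phase-locked periodic timers expire on the same
 * instant within (0, horizon_us].  Each coincidence is a chance to
 * handle both in one interrupt instead of two.
 */
static unsigned
sketch_count_coincident(uint64_t period_a, uint64_t period_b,
    uint64_t horizon_us)
{
	uint64_t ta = period_a, tb = period_b;
	unsigned hits = 0;

	while (ta <= horizon_us && tb <= horizon_us) {
		if (ta == tb) {
			hits++;
			ta += period_a;
			tb += period_b;
		} else if (ta < tb) {
			ta = sketch_align_next(ta, period_a);
		} else {
			tb = sketch_align_next(tb, period_b);
		}
	}
	return hits;
}
```

With periods of 10000us and 20000us the two timers coincide on every expiry of the slower one; with 10000us and 15000us they coincide every 30000us.
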
DragonFly's SYSTIMER abstraction has an incredibly simple API which
provides both one-shot and periodic 'hard' interrupts targeted to a
particular cpu. In DFly the callbacks work the same as IPI callbacks
in that they are real interrupts and Giant-free, and there is even a
trapframe available (so our hardclock distribution can use it).
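
For anyone who wants a feel for what a one-shot/periodic timer abstraction looks like, here is a minimal userland sketch. It is NOT the DragonFly systimer API: every name below (sketch_timer, sketch_timer_init, sketch_timer_dispatch) is invented for illustration, time is a plain microsecond counter, and there is no per-cpu targeting or trapframe. The value returned by the dispatch routine is what you would program into a variable-interval hardware timer.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*sketch_cb_t)(void *arg);

/* Hypothetical timer record, not the DragonFly structure. */
struct sketch_timer {
	uint64_t	next_us;	/* absolute time of next expiry */
	uint64_t	period_us;	/* 0 means one-shot */
	int		active;
	sketch_cb_t	func;
	void		*arg;
};

/* Example callback: bump the unsigned counter passed as arg. */
static void
sketch_count(void *arg)
{
	++*(unsigned *)arg;
}

/* Arm a timer: first expiry at now + delay, then every period (if any). */
static void
sketch_timer_init(struct sketch_timer *t, uint64_t now_us, uint64_t delay_us,
    uint64_t period_us, sketch_cb_t func, void *arg)
{
	t->next_us = now_us + delay_us;
	t->period_us = period_us;
	t->active = 1;
	t->func = func;
	t->arg = arg;
}

/*
 * Run every callback that is due at now_us.  Periodic timers catch up
 * one period at a time if the dispatch is late.  Returns the earliest
 * pending expiry, or UINT64_MAX if nothing is armed.
 */
static uint64_t
sketch_timer_dispatch(struct sketch_timer *timers, size_t n, uint64_t now_us)
{
	uint64_t earliest = UINT64_MAX;

	for (size_t i = 0; i < n; i++) {
		struct sketch_timer *t = &timers[i];

		while (t->active && t->next_us <= now_us) {
			t->func(t->arg);
			if (t->period_us == 0)
				t->active = 0;		/* one-shot is done */
			else
				t->next_us += t->period_us;
		}
		if (t->active && t->next_us < earliest)
			earliest = t->next_us;
	}
	return earliest;
}
```

The point of the design is that tick, stat and sched become three independent entries in a table like this rather than code paths hanging off a single hardclock.
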
FreeBSD's clock code is a huge mess... everything is interrelated
and integrated together which makes it difficult to make changes without
blowing something up. For example, using a high frequency w/ the 8254
often results in lost ticks and drifting clocks on FreeBSD due to
the way microtime is calculated.. the interval is just too small for
the code to handle the recycle case properly. (and regardless
of your support for other timebases, for which you have a massive
infrastructure in place, the 8254 is still used by many UP machines).

There are three caveats to the DragonFly SYSTIMER code.

First, you need to have a timebase capable of fine-grained variable
interval interrupts. I wound up using the 8254 speaker timer for that
purpose (so there is no longer any 8254 based beep support in DFly).
Since the SYSTIMER code is per-cpu capable and MP friendly, our
intention is to eventually use the LAPIC timer on each cpu.

Second, you need a separate timebase for microtime calculations (e.g.
we use the same 8254 timer FBsd uses, but with the reload set to 65536
instead of the tick interval). At the moment our code assumes the
same frequency but, in fact, the abstraction and methodology are designed
such that the variable interval interrupt can use a timer (e.g. per-cpu
LAPIC timer in the future) that is not necessarily synchronized to
the system timebase. That's important, because it's hard enough to
find one reliable (or at least ntp correctable) timebase and nearly
impossible to find multiple reliable timebases on a PC.
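
As a rough illustration of why the free-running reload helps: with the reload at 65536 the 8254 counter simply counts down modulo 2^16, so the elapsed count between two reads falls out of 16-bit unsigned subtraction, wrap included, as long as the counter is sampled at least once per wrap (about 55ms). The sketch below is not DragonFly or FreeBSD code; the function names are invented, and 1193182 Hz is the nominal PC 8254 input clock.

```c
#include <stdint.h>

#define SKETCH_I8254_FREQ_HZ	1193182u	/* nominal 8254 input clock */

/*
 * Elapsed counts between two reads of a free-running 16-bit down
 * counter.  The cast truncates to 16 bits, which handles the case
 * where the counter wrapped once between the two reads.
 */
static uint16_t
sketch_i8254_elapsed(uint16_t prev, uint16_t cur)
{
	return (uint16_t)(prev - cur);
}

/* Convert an accumulated count to nanoseconds at the nominal rate. */
static uint64_t
sketch_i8254_ticks_to_ns(uint64_t ticks)
{
	return ticks * 1000000000ull / SKETCH_I8254_FREQ_HZ;
}
```

If the counter is instead reloaded every tick interval, a shorter interval leaves less room between wraps for this kind of calculation, which is the situation described above for high HZ values.
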
Third, you need to implement the IPI messaging subsystem from DragonFly.
As I've said in the past, nearly every major subsystem in DragonFly now
depends on the IPI messaging module to communicate between cpus, and
this one is no exception. I highly recommend that the IPI messaging
subsystem be ported.

-Matt