FreeBSD handles leapsecond correctly
Matthew Dillon
dillon at apollo.backplane.com
Fri Jan 6 10:23:21 PST 2006
:Out of curiosity, what is DragonFly doing with the network timing counters (ie,
:TCPOPT_TIMESTAMP and the stuff in <netinet/tcp_timer.h>), has that been
:separated from HZ too?
:
:I'm pretty sure that setting:
:
:#define TCPTV_MSL ( 30*hz) /* max seg lifetime (hah!) */
:
:...with HZ=1000 or more is not entirely correct. :-) Not when it started with
:the TTL in hops being equated to one hop per second...
:
:--
:-Chuck
Well, you know what they say... if it ain't broke, don't fix it. In this
case the network stacks use that wonderful callwheel code that was
written years ago (in FreeBSD). SYSTIMERS aren't designed to handle
billions of timers like the callwheel code is, so it wouldn't be a
proper application.
The one change I made to the callwheel code was to make it per-cpu in
order to guarantee that e.g. a device driver that installs an interrupt
and a callout would get both on the same cpu and thus be able to use
normal critical sections to interlock between them. This is a
particularly important aspect of our lockless per-cpu tcp protocol
threads. DragonFly's crit_enter()/crit_exit() together only take 9ns
(with INVARIANTS turned on), whereas the minimum non-contended inline
mutex (lwkt_serialize_enter()/exit()) takes around 20ns.
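To illustrate why keeping the callout on the same cpu lets a plain critical section act as the interlock, here is a minimal userland sketch. It is only a model: the function names mirror crit_enter()/crit_exit(), but the bodies, the struct cpu_state layout, and callout_fire() are assumptions made for illustration and are not DragonFly's implementation.

```c
#include <assert.h>

#define NCPU 2

/* Hypothetical per-cpu state; not DragonFly's actual per-cpu layout. */
struct cpu_state {
    int crit_nest;        /* critical section nesting depth */
    int callout_pending;  /* callout fired while inside a critical section */
    int callouts_run;     /* how many times the callout handler ran */
    int shared_counter;   /* data shared by the driver and its callout */
};

static struct cpu_state cpus[NCPU];

/* The callout handler touches the same per-cpu data as the driver. */
static void callout_handler(struct cpu_state *cpu)
{
    /* Must never run while the owning cpu is inside a critical section. */
    assert(cpu->crit_nest == 0);
    cpu->shared_counter += 100;
    cpu->callouts_run++;
}

static void crit_enter(struct cpu_state *cpu)
{
    cpu->crit_nest++;
}

static void crit_exit(struct cpu_state *cpu)
{
    /* Leaving the outermost section runs anything that was deferred. */
    if (--cpu->crit_nest == 0 && cpu->callout_pending) {
        cpu->callout_pending = 0;
        callout_handler(cpu);
    }
}

/* Simulate the callwheel firing on this cpu. */
static void callout_fire(struct cpu_state *cpu)
{
    if (cpu->crit_nest > 0)
        cpu->callout_pending = 1;   /* defer until crit_exit() */
    else
        callout_handler(cpu);
}

/* Driver-side update with the timer expiring halfway through. */
static int driver_update(struct cpu_state *cpu)
{
    int seen;

    crit_enter(cpu);
    cpu->shared_counter += 1;
    callout_fire(cpu);              /* timer expires mid-update */
    seen = cpu->shared_counter;     /* the callout has not run yet */
    cpu->shared_counter += 1;
    crit_exit(cpu);                 /* deferred callout runs here */
    return seen;
}
```

Because both the driver code and its callout live on one cpu, no atomic operation or mutex is needed in this model: a nesting counter is enough to keep the handler out until the update is complete. If the callout could fire on a different cpu, the counter would protect nothing.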
I don't know what edge cases exist when 'hz' is set so high. Since we
don't use hz for things that would normally require it to be set to a
high frequency, we just leave hz set to 100.
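For reference, a timeout written as N*hz is a tick count, so the arithmetic below shows what changes with hz: the wall-clock length of 30*hz stays at 30 seconds, while the tick granularity gets finer. The helper names secs_to_ticks() and ticks_to_ms() are hypothetical and exist only for this sketch; it says nothing about other edge cases a high hz might expose.

```c
/* Convert a timeout in seconds to callout ticks for a given hz. */
static int secs_to_ticks(int secs, int hz)
{
    return secs * hz;
}

/* Convert a tick count back to milliseconds of wall-clock time. */
static long ticks_to_ms(int ticks, int hz)
{
    return (long)ticks * 1000 / hz;
}
```

With hz=100, secs_to_ticks(30, 100) is 3000 ticks of 10 ms each; with hz=1000 it is 30000 ticks of 1 ms each. Both come back to 30000 ms.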
--
One side note. I've found both our userland (traditional bsd4) and
our LWKT scheduler to be really finicky about being properly woken
up via AST when a reschedule is required. Preemption by non-interrupt
threads is not beneficial at all since even major kernel operations
usually take < 1us to execute. 'hz' is not relevant because it only affects
processes operating in batch. But 'forgetting' to queue an AST to
reschedule a thread ASAP (without preempting) when you are supposed
to can result in terrible interactive response because you have
processes winding up using their whole quantum before they realize
that they should have rescheduled. I've managed to break this three
times over the years in DragonFly... stupid things like forgetting a
crit_exit() or clearing the reschedule bit without actually rescheduling
or doing the wrong check in doreti(), etc. The bugs often went unnoticed
for weeks because they only showed up when someone did some heavily
cpu-bound work or testing. It is the A#1 problem that you have to look
for if you have scheduler issues. All non-interrupt-thread preemption
accomplishes is to blow up your caches and prevent you from being able
to aggregate work between threads (which could be especially important
since your I/O is threaded in FreeBSD).
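The effect of a lost AST can be shown with a toy model. This is a sketch under stated assumptions, not scheduler code: struct proc_model, request_resched() and run_until_switch() are hypothetical names, and "tick" here just means one unit of work followed by a check at the return-to-user boundary.

```c
#include <stdbool.h>

/* Hypothetical model of a running process; names are illustrative only. */
struct proc_model {
    int quantum;        /* ticks the process may run before a forced switch */
    bool need_resched;  /* set when a reschedule has been requested */
};

/* Request a reschedule without preempting: just queue the flag (the "AST"). */
static void request_resched(struct proc_model *p)
{
    p->need_resched = true;
}

/*
 * Run the process for up to its quantum. wake_tick is the tick at which
 * another thread becomes runnable. Passing queue_ast = false models the
 * bug of forgetting to queue the AST. Returns the tick at which the
 * process gave up the cpu.
 */
static int run_until_switch(struct proc_model *p, int wake_tick, bool queue_ast)
{
    int tick;

    p->need_resched = false;
    for (tick = 0; tick < p->quantum; tick++) {
        if (tick == wake_tick && queue_ast)
            request_resched(p);

        /* Check made at the boundary after each unit of work. */
        if (p->need_resched) {
            p->need_resched = false;  /* cleared only when we really switch */
            return tick;
        }
    }
    return p->quantum;                /* used the whole quantum */
}
```

With a quantum of 100 and a wakeup at tick 3, the process switches at tick 3 when the AST is queued and at tick 100 when it is forgotten. Nothing was preempted in either case; the only difference is whether the flag was there to be seen at the next boundary.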
-Matt
Matthew Dillon
<dillon at backplane.com>