FreeBSD handles leapsecond correctly

Fri Jan 6 10:23:21 PST 2006

:Out of curiosity, what is DragonFly doing with the network timing counters (ie, 
:TCPOPT_TIMESTAMP and the stuff in <netinet/tcp_timer.h>), has that been 
:separated from HZ too?
:
:I'm pretty sure that setting:
:
:#define TCPTV_MSL       ( 30*hz)                /* max seg lifetime (hah!) */
:
:...with HZ=1000 or more is not entirely correct.  :-)  Not when it started with 
:the TTL in hops being equated to one hop per second...
:
:-- 
:-Chuck

    Well, you know what they say... if it ain't broke, don't fix it.  In this
    case the network stacks use that wonderful callwheel code that was 
    written years ago (in FreeBSD).  SYSTIMERS aren't designed to handle
    billions of timers the way the callwheel code is, so they wouldn't be
    a good fit here.
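
    For readers who haven't looked at a callwheel: the sketch below is a
    minimal hashed timing wheel of the general kind, not the FreeBSD or
    DragonFly code.  All names (callwheel, callout_arm, callwheel_tick,
    count_fire) and the wheel size are made up for illustration.  The
    point it shows is that arming a timer costs the same no matter how
    many timers are already pending.

```c
#include <assert.h>
#include <stddef.h>

#define WHEEL_SIZE 256                  /* hypothetical; must be a power of two */
#define WHEEL_MASK (WHEEL_SIZE - 1)

struct callout {
    struct callout *next;               /* chain within one bucket */
    unsigned long   expire;             /* absolute tick at which to fire */
    void          (*func)(void *);
    void           *arg;
};

struct callwheel {
    struct callout *bucket[WHEEL_SIZE];
    unsigned long   curticks;           /* last tick that was processed */
};

/* Example callback: bump the int that 'arg' points to. */
static void
count_fire(void *arg)
{
    ++*(int *)arg;
}

/* Arm a callout 'ticks' from now.  O(1) however many are pending. */
static void
callout_arm(struct callwheel *w, struct callout *c, unsigned long ticks,
            void (*func)(void *), void *arg)
{
    assert(ticks > 0);
    c->expire = w->curticks + ticks;
    c->func = func;
    c->arg = arg;
    c->next = w->bucket[c->expire & WHEEL_MASK];
    w->bucket[c->expire & WHEEL_MASK] = c;
}

/* Advance one tick, fire what is due in that bucket, return the count. */
static int
callwheel_tick(struct callwheel *w)
{
    struct callout **pp, *c;
    int fired = 0;

    w->curticks++;
    pp = &w->bucket[w->curticks & WHEEL_MASK];
    while ((c = *pp) != NULL) {
        if (c->expire == w->curticks) {
            *pp = c->next;              /* unlink before calling */
            c->func(c->arg);
            fired++;
        } else {
            pp = &c->next;              /* due on a later revolution */
        }
    }
    return fired;
}
```

    Each tick only walks one bucket, so the per-tick cost depends on how
    many timers hash to that slot rather than on the total number armed.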

    The one change I made to the callwheel code was to make it per-cpu in
    order to guarantee that e.g. a device driver that installs an interrupt
    and a callout would get both on the same cpu and thus be able to use
    normal critical sections to interlock between them.  This is a 
    particularly important aspect of our lockless per-cpu tcp protocol
    threads.  DragonFly's crit_enter()/crit_exit() together only take 9ns
    (with INVARIANTS turned on), whereas the minimum non-contended inline
    mutex (lwkt_serialize_enter()/exit()) takes around 20ns.
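
    A rough user-space model of why same-cpu placement makes a critical
    section sufficient, assuming a critical section is just a per-cpu
    nesting count that defers local callouts until it drops to zero.
    This is not the real crit_enter()/crit_exit(); struct cpu_state and
    the *_sketch names are hypothetical.

```c
#include <assert.h>

struct cpu_state {              /* hypothetical per-cpu data */
    int crit_count;             /* critical section nesting depth */
    int pending;                /* a callout fired while we were inside */
    int shared;                 /* state touched by driver code and callout */
};

/* The work the callout does on the shared state. */
static void
callout_body(struct cpu_state *cpu)
{
    cpu->shared++;
    cpu->pending = 0;
}

static void
crit_enter_sketch(struct cpu_state *cpu)
{
    cpu->crit_count++;          /* plain increment, no atomic op, no bus lock */
}

static void
crit_exit_sketch(struct cpu_state *cpu)
{
    assert(cpu->crit_count > 0);
    if (--cpu->crit_count == 0 && cpu->pending)
        callout_body(cpu);      /* run what was deferred */
}

/* Called when the timer fires on this same cpu. */
static void
callout_fire(struct cpu_state *cpu)
{
    if (cpu->crit_count > 0)
        cpu->pending = 1;       /* defer until the section ends */
    else
        callout_body(cpu);
}
```

    The model only works because everything that touches 'shared' runs
    on one cpu.  If the callout could fire on another cpu, the nesting
    count would protect nothing and a real lock would be needed.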

    I don't know what edge cases exist when 'hz' is set so high.  Since we
    don't use hz for things that would normally require it to be set to a
    high frequency, we just leave hz set to 100.
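
    For reference, the plain arithmetic behind a define such as the
    quoted TCPTV_MSL, with helper names (msl_ticks, ticks_to_ms,
    tick_granularity_us) invented for this sketch.  It only shows the
    tick/time conversion and says nothing about the edge cases
    mentioned above.

```c
#include <assert.h>

/* Tick count for a 30 second timer, in the style of (30*hz). */
static long
msl_ticks(long hz)
{
    return 30 * hz;
}

/* Wall-clock length in milliseconds of 'ticks' ticks at 'hz' ticks/sec. */
static long
ticks_to_ms(long ticks, long hz)
{
    assert(hz > 0);
    return ticks * 1000 / hz;
}

/* Length of a single tick in microseconds. */
static long
tick_granularity_us(long hz)
{
    assert(hz > 0);
    return 1000000 / hz;
}
```

    With hz=100 the timer is 3000 ticks of 10000us each; with hz=1000
    it is 30000 ticks of 1000us each.  Both come to 30000ms, so what
    changes with hz is the tick count and the granularity, not the
    nominal interval.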

    --

    One side note.  I've found both our userland (traditional bsd4) and
    our LWKT scheduler to be really finicky about being properly woken
    up via AST when a reschedule is required.  Preemption by <<non-interrupt>>
    threads is not beneficial at all since most major kernel operations take
    < 1us to execute.  'hz' is not relevant because it only affects
    processes operating in batch.  But 'forgetting' to queue an AST to
    reschedule a thread ASAP (without preempting) when you are supposed
    to can result in terrible interactive response because you have
    processes winding up using their whole quantum before they realize
    that they should have rescheduled.  I've managed to break this three
    times over the years in DragonFly... stupid things like forgetting a
    crit_exit() or clearing the reschedule bit without actually rescheduling
    or doing the wrong check in doreti(), etc.  The bugs often went unnoticed
    for weeks, until someone ran some heavily cpu-bound work or test.  It is
    the A#1 problem that you have to look
    for if you have scheduler issues.  All non-interrupt-thread preemption
    accomplishes is to blow up your caches and prevent you from being able
    to aggregate work between threads (which could be especially important
    since your I/O is threaded in FreeBSD).
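
    A toy model of the failure described above, assuming a single
    "reschedule requested" flag that is checked once per tick on the
    return-to-user path.  It is not the DragonFly or FreeBSD scheduler;
    QUANTUM_TICKS, struct sched_sim, wakeup_sketch and latency_ticks are
    all hypothetical.

```c
#include <assert.h>

#define QUANTUM_TICKS 10        /* hypothetical quantum length in ticks */

struct sched_sim {
    int need_resched;           /* stands in for the pending AST */
};

/* An interactive thread becomes runnable while a cpu-bound one runs. */
static void
wakeup_sketch(struct sched_sim *s, int queue_ast)
{
    if (queue_ast)
        s->need_resched = 1;    /* ask for a switch at the next safe point */
    /* else: the bug -- a thread is runnable but nobody was told */
}

/*
 * Run the cpu-bound thread for one quantum with a wakeup arriving at
 * 'wake_tick'.  Returns how many ticks the woken thread had to wait.
 */
static int
latency_ticks(int wake_tick, int queue_ast)
{
    struct sched_sim s = { 0 };
    int t;

    assert(wake_tick >= 0 && wake_tick < QUANTUM_TICKS);
    for (t = 0; t < QUANTUM_TICKS; t++) {
        if (t == wake_tick)
            wakeup_sketch(&s, queue_ast);
        /* return-to-user path: the only place the flag is acted on */
        if (s.need_resched) {
            s.need_resched = 0; /* cleared only because we really switch */
            return t - wake_tick;
        }
    }
    return QUANTUM_TICKS - wake_tick;   /* waited for quantum expiry */
}
```

    In this model latency_ticks(2, 1) is 0 and latency_ticks(2, 0) is 8:
    with the flag queued the switch happens at the next check, without
    it the woken thread sits out the rest of the quantum.  No preemption
    is involved in either case.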

					-Matt
					Matthew Dillon 
					<dillon at backplane.com>