FreeBSD handles leapsecond correctly

Sat Jan 7 08:24:18 PST 2006

Matthew Dillon wrote:
> :Luigi Rizzo wrote:
> :> On Sun, Jan 01, 2006 at 10:59:14AM +0100, Poul-Henning Kamp wrote:
> :>>http://phk.freebsd.dk/misc/leapsecond.txt
> :>>
> :>>Notice how CLOCK_REALTIME recycles the 1136073599 second.
> :> 
> :> on a related topic, any comments on this one ?
> :> Is this code that we could use ?
> :> 
> :> 	http://www.dragonflybsd.org/docs/nanosleep/
> :
> :I ported the tvtohz change from Dragonfly back to 4.10 and 5-STABLE here:
> :
> :http://www.pkix.net/~chuck/timer/
> :
> :...so anyone who wants to experiment can try it out.  :-)
> :
> :-- 
> :-Chuck
> 
>     It isn't so much tvtohz that's the issue, but the fact that the
>     nanosleep() system call has really coarse hz-based resolution.  That's
>     been fixed in DragonFly and I would recommend that it be fixed in
>     FreeBSD too.   After all, there isn't much of a point having a system
>     call called 'nanosleep' whose time resolution is coarse-grained and
>     non-deterministic from computer to computer (based on how hz was
>     configured).
> 
>     Since you seem to be depending on fine-resolution timers more and 
>     more in recent kernels, you should consider porting our SYSTIMER API
>     to virtualize one-shot and periodic-timers.  Look at kern/kern_systimer.c
>     in the DragonFly source.  The code is fairly well abstracted, operates
>     on a per-cpu basis, and even though you don't have generic IPI messaging
>     I think you could port it without too much trouble. 
> 
>     If you port it and start using it you will quickly find that you can't
>     live without it.  e.g. take a look at how we implement network POLLING for
>     an example of its use.  The polling rate can be set to anything at
>     any time, regardless of 'hz'.  Same goes for interrupt rate limiting,
>     various scheduler timers, and a number of other things.  All the things
>     that should be divorced from 'hz' have been.
> 
>     For people worried about edge conditions due to multiple unsynchronized
>     timers going off I will note that it's never been an issue for us, and
>     in any case it's fairly trivial to adjust the systimer code to synchronize
>     periodic time bases which run at integer multiples to timeout at the
>     same time.  Most periodic time bases tend to operate in this fashion
>     (the stat clock being the only notable exception) so full efficiency
>     can be retained.  But, as I said, I've actually done that and not
>     noticed any significant improvement in performance so I just don't bother
>     right now.

Matt,

I've been testing network and routing performance over the past two weeks
with a calibrated Agilent N2X packet generator.  My test box is a dual
Opteron 852 (2.6 GHz) with Tyan S8228 mobo and Intel dual-GigE in PCI-X-133
slot. Note that I've run all tests with UP kernels em0->em1.

For stock FreeBSD-7-CURRENT from 28 Dec. 2005 I've got 580kpps with
fast-forward enabled.  An em(4) patch from Scott Long implementing a
taskqueue raised this to 729kpps.

For stock DragonFlyBSD-1.4-RC1 I've got 327kpps and then it breaks down and
never ever passes a packet again until a down/up on the receiving interface.
net.inet.ip.intr_queue_maxlen has to be set to 200, otherwise it breaks down
at 252kpps already.  Enabling polling did not make a difference and I've tried
various settings and combinations without any apparent effect on performance
(burst=1000, each_burst=50, user_frac=1, pollhz=5000).

What surprised me most, apart from the generally poor performance, is the sharp
dropoff after max pps and the wedging of the interface.  I didn't see this kind
of behaviour on any other OS I've tested (FreeBSD and OpenBSD).

-- 
Andre