Kernel time-keeping adjustments - how to tune?

Mon Jan 17 20:29:52 PST 2005

On Mon, Jan 17, 2005 at 05:17:12PM -0700, Danny MacMillan wrote:
> On Mon, Jan 17, 2005 at 03:12:46PM -0600, John wrote:
> > On Mon, Jan 17, 2005 at 03:05:22PM -0600, Kevin Kinsey wrote:
> > > John wrote:
> > > 
> > > >OK - on my FreeBSD 5.3-STABLE system, as I have documented (cf:
> > > >message thread Re: ntpd problems since upgrading to 5.3), ntpd
> > > >won't run, even with an identical configuration to the 5.2.1 system
> > > >next to it.  Furthermore, when I run adjkerntz -a, it totally whacks
> > > >the system's ability to keep time - it races forward at quite a
> > > >high rate.  ntpdate runs, and sets the time correctly.
> > > >
> > > >At one point, something managed to get the timekeeping parameters
> > > >pretty near normal - less than a second of drift per hour (much
> > > >better than the 40% rate it is now - it gains about 24 seconds PER
> > > >MINUTE).  Then I ran adjkerntz -a again, just to see if it really
> > > >was the culprit.  It does seem that it is adjkerntz that is causing
> > > >(or triggering) the problem, but now I can't get the system back
> > > >to a decent time-keeping rate.  Whatever it was I stumbled across
> > > >before, I'm not finding it again now.
> > > >
> > > >Now, it doesn't appear that adjkerntz itself has changed in YEARS,
> > > >so it must be some change in the system call operation, parameters,
> > > >or data structures that is causing this.
> > > >
> > > >So - since I don't seem to be able to stumble across what I did
> > > >right before to get the timekeeping somewhat near normal, I am
> > > >wondering if there's a manual way to reach them.
> > > 
> > > I read through the cited thread, and don't see any replies;
> > > nor do I see enough explanation to give you any magic
> > > beans.  Of course, I'm no one's fairy godmother...
> > 
> > LOL!  No - I don't expect you to be - that'd take ALL the challenge
> > out of it!
> > 
> > > > the clock on my 5.3-STABLE system is RACING.
> > > > It is going at almost twice as fast as real time.
> > > 
> > > 
> > > Hmm, that might mean something.  What do you get from:
> > > 
> > > sysctl -a | grep timecounter
> > 
> > I don't know what all this means, but here it is...
> > kern.timecounter.stepwarnings: 0
> > kern.timecounter.nbinuptime: 37254938
> > kern.timecounter.nnanouptime: 0
> > kern.timecounter.nmicrouptime: 3040
> > kern.timecounter.nbintime: 19671985
> > kern.timecounter.nnanotime: 2982761
> > kern.timecounter.nmicrotime: 16689224
> > kern.timecounter.ngetbinuptime: 0
> > kern.timecounter.ngetnanouptime: 318046
> > kern.timecounter.ngetmicrouptime: 14256461
> > kern.timecounter.ngetbintime: 0
> > kern.timecounter.ngetnanotime: 0
> > kern.timecounter.ngetmicrotime: 3461614
> > kern.timecounter.nsetclock: 87
> > kern.timecounter.hardware: TSC
> > kern.timecounter.choice: TSC(800) i8254(0) dummy(-1000000)
> > kern.timecounter.tick: 1
> > 
> > Are these all documented somewhere?  I'm sure they must be, but
> > I don't know where to look...
> > 
> > > ??
> > > 
> > > IANAE, but I wonder if ntpd is going to be able to sync
> > 
> > Well, maybe you will be soon.  An "expert" is anyone who makes
> > three consecutive correct guesses on the same topic... :)
> > 
> > > up until the local clock runs realistically....
> > 
> > Well, I thought of that, too, and during the period between when I
> > had it running decently and before I decided to try (all too
> > successfully) to recreate the problem with adjkerntz, I did
> > try ntpd again, but with the same results.  It simply acted like
> > it could not see the server.
> 
> There are limits to ntpd's ability.  It can't correct for an inaccuracy
> in the local timecounter of more than around 500 ppm.  I'm not sure
> what happens when the actual offset is 400 000 ppm ... it could be the
> behaviour you're seeing.
> 
> I see that Kevin has already asked about timecounter.  I was about to
> suggest the same.  My server at work had much the same problem as yours
> (the clock raced at nearly double the proper speed).  I resolved the
> issue by instructing the server to use an alternative timecounter.  I
> put the following in my /etc/sysctl.conf and rebooted (this is on a
> 5.1 system):
> 
> kern.timecounter.hardware=i8254
> 
> In my case, the bogus timecounter was one labelled "ACPI-Safe" in the
> dmesg output.  I would try the i8254 if I were you, since it looks like
> it's the only other option available.  Hopefully the i8254 will fall
> within the operating tolerances of ntpd.
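
For scale, the drift figures quoted above can be turned into ppm with
plain shell arithmetic.  This is only a rough sketch: drift_ppm is a
made-up helper name, not a system utility, and the inputs are the
numbers from this thread (24 seconds gained per minute, and roughly
1 second per hour when things were near normal).

```sh
#!/bin/sh
# drift_ppm GAINED_SECONDS ELAPSED_SECONDS
# Hypothetical helper: express clock gain over an interval in parts
# per million, using integer arithmetic (the result is truncated).
drift_ppm() {
    echo $(( $1 * 1000000 / $2 ))
}

drift_ppm 24 60     # 24 s gained per minute; prints 400000
drift_ppm 1 3600    # about 1 s gained per hour; prints 277
```

So the racing clock is off by about 400 000 ppm, far beyond the
roughly 500 ppm mentioned above, while a second or so per hour would
fall inside it.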

Awright!!!!

I don't know how this happens, but using TSC (whatever that is),
the clock skew is all over the dang map!  With the i8254, it's VERY
stable.

It didn't help with ntpd, which is still saying:
Jan 17 22:26:17 pearl ntpd[1897]: ntpd 4.2.0-a Sun Jan  9 10:58:59 CST 2005 (1)
Jan 17 22:26:17 pearl ntpd[1897]: bind() fd 7, family 2, port 123, addr 0.0.0.0, in_classd=0 flags=8 fails: Address already in use
even though netstat doesn't show anything bound to port 123 before
I start it.

But I'm not gaining or losing 40% of real time randomly anymore!

That's good!
-- 

John Lind
john at starfire.MN.ORG