4.8 "Alternate system clock has died" error

Tue Nov 22 09:35:36 GMT 2005

Charles Sprickman wrote:
> On Tue, 22 Nov 2005, Uwe Doering wrote:
>> Charles Sprickman wrote:
>>> [...]
>>> So it certainly looks easy enough for me to change the first two 
>>> sections of the diff referenced above, but I'm having issues finding 
>>> that last one in cpu_initclocks().  It looks like that section really 
>>> has changed quite a bit. (see v.1.206)
>>
>> Just look for all instances of
>>
>>  writertc(RTC_STATUSB, rtc_statusb);
>>
>> and put
>>
>>  rtcin(RTC_INTR);
>>
>> directly after them (on the next line).  There should be three of 
>> them, in 4.8 as well as RELENG_4 and CURRENT.
> 
> I found the first two (lines 721 and 979) and I see the third at line 1064.
> 
> One other question, looking at the initial patch 
> (http://www.freebsd.org/cgi/query-pr.cgi?pr=17800), I see that they 
> followed a slightly different line:
> 
>     /* Initialize RTC. */
>      writertc(RTC_STATUSA, rtc_statusa);
>      writertc(RTC_STATUSB, RTCSB_24HR);  <<<---
> +    rtcin(RTC_INTR); /* clear any pending interrupt */
> 
> Should I worry about that at all?

No.  User-supplied patches in PRs aren't necessarily 100% correct.  In 
this case the PR submitter clears pending interrupts while interrupt 
generation is still disabled.  However, the committer of 1.214 (John 
Baldwin, in fact) considered this wrong: you have to clear pending 
interrupts after interrupt generation has been re-enabled, or else you 
get a race condition.  And I agree with that.
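
For illustration only, here is a small user-space mock of that ordering. 
The register numbers and bit masks are assumptions based on the MC146818 
register layout (check rtc.h in your tree), and mock_regs and 
mock_periodic_tick() are made up for the example.  The real writertc() 
and rtcin() do port I/O inside the kernel, so take this as a sketch, not 
as the committed code.

```c
#include <stdint.h>

/* Assumed values, following the MC146818 register layout. */
#define RTC_STATUSB   0x0b   /* status register B: interrupt enables */
#define RTC_INTR      0x0c   /* status register C: pending interrupt flags */
#define RTCSB_PINTR   0x40   /* B: periodic interrupt enable */
#define RTCIR_PERIOD  0x40   /* C: periodic interrupt pending */

static uint8_t mock_regs[0x10];   /* stand-in for the RTC register file */

/* Mock of writertc(): store a value in a register. */
static void writertc(int reg, uint8_t val)
{
    mock_regs[reg] = val;
}

/* Mock of rtcin(): return a register; reading status C clears it. */
static uint8_t rtcin(int reg)
{
    uint8_t val = mock_regs[reg];

    if (reg == RTC_INTR)
        mock_regs[reg] = 0;
    return val;
}

/* Hypothetical helper: the chip latches a periodic tick in status C. */
static void mock_periodic_tick(void)
{
    mock_regs[RTC_INTR] |= RTCIR_PERIOD;
}

/* Ordering from the thread: enable first, then clear whatever is pending. */
static void enable_rtc_intr(uint8_t rtc_statusb)
{
    writertc(RTC_STATUSB, rtc_statusb);
    (void)rtcin(RTC_INTR);   /* discard and clear any pending interrupt */
}
```

Calling mock_periodic_tick() and then enable_rtc_intr(RTCSB_PINTR) leaves 
status C empty, which is the state the three patched call sites are meant 
to reach.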

>>> Is there any interest in moving this back to 4-STABLE?
>>
>> Interest there is, I suppose.  Plenty of people still run 4.x.  The 
>> question is rather whether there is any committer willing to do the 
>> backport.  As far as I can tell, most of them are more focused on 
>> newer branches.  Perhaps we need special backporting committers for 
>> legacy branches.  Just a thought. ;-)
> 
> Yeah, I work on a total of about 32 boxes, all still at either 4.8 or 
> 4.11.  Committers have to know C, right? :)

Not only that.  For kernel issues they also need quite a lot of 
knowledge and experience in kernel hacking.  If you botch a kernel patch 
and don't notice in time, you'll likely cause quite a lot of grief for 
the users.

> [...]
> If anyone wants to satisfy my curiosity about this whole clock issue, 
> I'd be grateful.  A few questions:
> 
> RTC = "CMOS clock"?

Right.

> Does the RTC supply all timing in a system, or just "time of day"?

Just the latter (once at boot time) and also the RTC interrupt 
('stathz').  Perhaps 'profhz' too, if you enable profiling.  Can't tell 
offhand.
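
As a rough sketch of where rates like 128 Hz and 1024 Hz can come from: 
on an MC146818-compatible RTC the periodic interrupt frequency is 
selected by dividing a 32.768 kHz time base.  The helper name and the 
assumption that your chip follows this scheme are mine, not something 
taken from the FreeBSD sources.

```c
/* Assumed 32.768 kHz time base, as on an MC146818-compatible RTC. */
#define RTC_BASE_HZ 32768

/*
 * Hypothetical helper: periodic interrupt frequency in Hz for
 * rate-select values 3..15.  Returns 0 for values outside that range.
 */
static int rtc_periodic_hz(int rate_select)
{
    if (rate_select < 3 || rate_select > 15)
        return 0;
    return RTC_BASE_HZ >> (rate_select - 1);
}
```

Under that assumption a rate-select value of 9 gives 128 Hz and a value 
of 6 gives 1024 Hz.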

> How does this line relate to things?
> 
> kern.clockrate: { hz = 100, tick = 10000, tickadj = 5, profhz = 1024, stathz = 128 }
> 
> If the RTC doesn't supply the base timing for things like the I/O 
> buses and the processor, what does?

There's another clock device, from which 'hz' is derived, for example. 
Perhaps one can call it the CPU clock.  It drives the CPU, memory, PCI 
bus etc.  Also, while the system is running it drives the kernel's 
time-of-day clock.  However, the CPU clock's frequency isn't overly 
accurate.  That's why the kernel's time-of-day clock usually drifts 
noticeably from the wall clock unless you apply some form of continuous 
adjustment, for instance NTP.
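
To make the numbers concrete, a minimal sketch: 'tick' is simply the 
length of one 'hz' period in microseconds, and a small frequency error 
adds up over a day.  Both helper names are hypothetical, and the 100 ppm 
used below is an assumed error for illustration, not a measured figure.

```c
/* Length of one 'hz' tick in microseconds. */
static int tick_usec(int hz)
{
    return 1000000 / hz;
}

/*
 * Hypothetical helper: drift in milliseconds per day for a clock whose
 * frequency is off by 'ppm' parts per million (86400 seconds per day).
 */
static long drift_ms_per_day(long ppm)
{
    return ppm * 86400L / 1000L;
}
```

With hz = 100, tick_usec() returns 10000, matching the sysctl output.  An 
assumed error of 100 ppm works out to 8640 ms, i.e. about 8.6 seconds per 
day.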

    Uwe
-- 
Uwe Doering         |  EscapeBox - Managed On-Demand UNIX Servers
gemini at geminix.org  |  http://www.escapebox.net