SW_WATCHDOG vs new eventtimer code

Andriy Gapon avg at FreeBSD.org
Tue Sep 20 20:28:21 UTC 2011


on 20/09/2011 23:04 Alexander Motin said the following:
> Hi.
> 
> On 20.09.2011 22:19, Andriy Gapon wrote:
>> just want to check with you first if the following makes sense.
>> I use SW_WATCHDOG on one of the test machines, which was recently updated
>> from stable/8 to head.  Now it gets seemingly random watchdog events.
>> My theory is that this is because of the eventtimer logic.
>> If we accumulate enough timer ticks during an idle period and then run all
>> those ticks very rapidly, the SW_WATCHDOG code may get the impression that it
>> was not patted for many real ticks.
>> Not sure what would be the best way to make SW_WATCHDOG happier/smarter.
> 
> Eventtimer code is now set to generate interrupts at least 4 times per
> second for each CPU. Since SW_WATCHDOG only handles periods of more
> than one second, I would say it should not be hurt. I would try to add
> some debug there to see what's going on (how big the tick bursts are).
> I'll try to do it tomorrow.

Just in case, here is a debugging snippet from a panic that I've got:
...
#12 0xffffffff80425d80 in watchdog_fire () at /usr/src/sys/kern/kern_clock.c:858
#13 0xffffffff8042603e in hardclock_anycpu (cnt=15761, usermode=Variable
"usermode" is not available.
) at atomic.h:183
#14 0xffffffff80660ae5 in handleevents (now=0xffffff80e3e0b8b0, fake=0) at
/usr/src/sys/kern/kern_clocksource.c:209
#15 0xffffffff80661b48 in timercb (et=Variable "et" is not available.
) at /usr/src/sys/kern/kern_clocksource.c:379
#16 0xffffffff802cc068 in hpet_intr_single (arg=Variable "arg" is not available.
) at /usr/src/sys/dev/acpica/acpi_hpet.c:258
#17 0xffffffff802cc71e in hpet_intr (arg=0xffffff80e3e0b5b0) at
/usr/src/sys/dev/acpica/acpi_hpet.c:276
#18 0xffffffff80444b02 in intr_event_handle (ie=0xfffffe0002751500,
frame=0xffffff80e3e0ba30) at /usr/src/sys/kern/kern_intr.c:1428
#19 0xffffffff8062f920 in intr_remove_handler (cookie=0xffffff80e3e0b5b0) at
/usr/src/sys/amd64/amd64/intr_machdep.c:197
#20 0xffffffff8069cca9 in lapic_enable_pmc () at
/usr/src/sys/x86/x86/local_apic.c:431
#21 0xffffffff8062cc70 in Xapic_isr2 () at apic_vector.S:87
#22 0xffffffff80443118 in intr_event_execute_handlers (p=0xfffffe0002758000,
ie=0xfffffe0002a5eb00) at /usr/src/sys/kern/kern_intr.c:1244
#23 0xffffffff80444164 in ithread_loop (arg=0xfffffe0002758000) at
/usr/src/sys/kern/kern_intr.c:1269
#24 0xffffffff8044053a in fork_exit (callout=0xffffffff80444024
<intr_event_add_handler+1029>, arg=0xfffffe0002b4f700, frame=0xffffff80e3e0bc50)
    at /usr/src/sys/kern/kern_fork.c:1024
#25 0xffffffff8062cb0e in Xint0x80_syscall () at ia32_exception.S:62
#26 0x0000000000000000 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) fr 14
#14 0xffffffff80660ae5 in handleevents (now=0xffffff80e3e0b8b0, fake=0) at
/usr/src/sys/kern/kern_clocksource.c:209
209             while (bintime_cmp(now, &state->nextstat, >=)) {
(kgdb) list
204             }
205             if (runs && fake < 2) {
206                     hardclock_anycpu(runs, usermode);
207                     done = 1;
208             }
209             while (bintime_cmp(now, &state->nextstat, >=)) {
210                     if (fake < 2)
211                             statclock(usermode);
212                     bintime_add(&state->nextstat, &statperiod);
213                     done = 1;
(kgdb) p state->nextstat
$1 = {sec = 90, frac = 15986939599958264124}
(kgdb) p *now
$3 = {sec = 106, frac = 11494276814354478452}
(kgdb) p statperiod
$4 = {sec = 0, frac = 145249953336295682}

(kgdb) fr 13
#13 0xffffffff8042603e in hardclock_anycpu (cnt=15761, usermode=Variable
"usermode" is not available.
) at atomic.h:183
183     atomic.h: No such file or directory.
        in atomic.h
(kgdb) p cnt
$5 = 15761
(kgdb) p newticks
$6 = 15000
(kgdb) p watchdog_ticks
$7 = 16000
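
As a cross-check of the values printed above, the sketch below converts the
bintime pairs to seconds. It uses a local stand-in struct (`struct bt`, a
hypothetical name) rather than the kernel's own headers, and it assumes that
`frac` counts units of 2^-64 seconds. With the printed values, the gap between
`*now` and `state->nextstat` comes out at roughly 15.76 seconds, which is
consistent with cnt=15761 if hz is 1000 (the post does not state hz; the 16000
ticks for a ~16 second timeout suggest it).

```c
#include <stdint.h>

/* Local stand-in for a bintime value: whole seconds plus a 64-bit
 * binary fraction.  The struct name is hypothetical. */
struct bt {
	int64_t  sec;
	uint64_t frac;
};

/* Convert to floating-point seconds, assuming frac is in units of
 * 2^-64 s.  The literal below is exactly 2^64 as a double. */
static double
bt_to_seconds(struct bt b)
{
	return (double)b.sec + (double)b.frac / 18446744073709551616.0;
}

/* Elapsed seconds between two bintime values. */
static double
bt_gap_seconds(struct bt later, struct bt earlier)
{
	return bt_to_seconds(later) - bt_to_seconds(earlier);
}
```

Feeding it {106, 11494276814354478452} and {90, 15986939599958264124} gives the
gap; applying `bt_to_seconds` to statperiod gives a period of about 1/127 s.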

Watchdog timeout was set to ~16 seconds.
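
To illustrate the theory from the quoted text, here is a toy model of a
tick-counting watchdog. It is not the SW_WATCHDOG code from kern_clock.c. All
names (`toy_watchdog`, `toy_run`, and so on) are hypothetical, and the model
assumes that the patting side only gets a chance to run between batches of
ticks. Under those assumptions, the same amount of elapsed time is harmless
when the ticks arrive in small batches, but expires the watchdog when most of
it arrives as one burst.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; not the kernel's data structures. */
struct toy_watchdog {
	int timeout_ticks;	/* budget restored by each pat */
	int remaining_ticks;	/* ticks left before expiry */
};

/* Pat: restore the full budget. */
static void
toy_pat(struct toy_watchdog *wd)
{
	wd->remaining_ticks = wd->timeout_ticks;
}

/* Charge a batch of ticks at once; returns true if the budget ran out. */
static bool
toy_charge(struct toy_watchdog *wd, int newticks)
{
	wd->remaining_ticks -= newticks;
	return (wd->remaining_ticks <= 0);
}

/*
 * Deliver ticks in the given batches.  The patter wants to pat every
 * pat_interval ticks but (by assumption) can only do so after a batch
 * has been processed.  Returns true if the watchdog expired.
 */
static bool
toy_run(int timeout, int pat_interval, const int *batches, size_t n)
{
	struct toy_watchdog wd = { timeout, timeout };
	int since_pat = 0;

	for (size_t i = 0; i < n; i++) {
		if (toy_charge(&wd, batches[i]))
			return (true);
		since_pat += batches[i];
		if (since_pat >= pat_interval) {
			toy_pat(&wd);
			since_pat = 0;
		}
	}
	return (false);
}
```

With a 16000-tick timeout and an assumed pat interval of 8000 ticks, four
batches of 250 followed by one batch of 15761 expire the toy watchdog, while
the same number of ticks delivered 250 at a time do not.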

-- 
Andriy Gapon

