watchdogs

Alfred Perlstein bright at mu.org
Wed Feb 20 19:41:23 UTC 2013


On 2/20/13 11:36 AM, Eugene M. Zheganin wrote:
> Hi.
>
> I have a bunch of FreeBSD machines that hang (and I really want to do
> something to fight this). Maybe it's ZFS or maybe it's pf (I
> also have a bunch of really stable ones, so it's hard to isolate and
> tell). Since the 9.x machines hang more often, I suppose it's pf. I use
> ichwd.ko and watchdogd to reboot a machine when it hangs. It works
> pretty well; I'm also working through various WITNESS/INVARIANTS
> output and trying to report it to GNATS, but obviously it would be much
> nicer if the system would panic and leave a debuggable core after a
> hang (so far I don't have any, so I can only guess). I've read about
> the software watchdog in the kernel and I don't quite understand: it's
> said that the kernel software watchdog is able to panic when a deadlock
> occurs. Can this be achieved with ichwd? Another question: as far as I
> understand, ichwd reboots my machine at the hardware level, right? So
> am I right in saying that the software watchdog can, in theory, also
> be deadlocked, making it a somewhat less reliable solution?
>
Yes, all your assumptions are correct.

There is an 'enhanced watchdog' branch that I am working on that offers
a "pre-watchdog timeout panic".  However, since this is done in
software, you may not get your pre-timeout panic and may only get a reboot.
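
To illustrate the two-stage idea only: the sketch below is a user-space
model, not code from the branch. Every name in it (wd_state, wd_tick,
and so on) is hypothetical. The software stage fires a fixed number of
seconds before the hard timeout; if the software stage never runs, the
hard timeout still triggers.

```c
#include <assert.h>

/* Hypothetical model of a watchdog with a software pre-timeout stage. */
enum wd_action { WD_NONE, WD_PRETIMEOUT, WD_HARD_RESET };

struct wd_state {
	unsigned timeout;    /* seconds without a pat before the hard reset */
	unsigned pretimeout; /* seconds before the timeout at which software acts */
	unsigned last_pat;   /* time of the most recent pat, in seconds */
	int pre_fired;       /* nonzero once the pre-timeout action has run */
};

static void
wd_init(struct wd_state *wd, unsigned timeout, unsigned pretimeout,
    unsigned now)
{
	/* The pre-timeout must fall strictly inside the timeout window. */
	assert(pretimeout < timeout);
	wd->timeout = timeout;
	wd->pretimeout = pretimeout;
	wd->last_pat = now;
	wd->pre_fired = 0;
}

/* A pat restarts the countdown and re-arms the pre-timeout stage. */
static void
wd_pat(struct wd_state *wd, unsigned now)
{
	wd->last_pat = now;
	wd->pre_fired = 0;
}

/* Decide what should happen at time 'now' given the last pat. */
static enum wd_action
wd_tick(struct wd_state *wd, unsigned now)
{
	unsigned elapsed = now - wd->last_pat;

	if (elapsed >= wd->timeout)
		return (WD_HARD_RESET);	/* the hardware stage: reset */
	if (!wd->pre_fired && elapsed >= wd->timeout - wd->pretimeout) {
		wd->pre_fired = 1;
		return (WD_PRETIMEOUT);	/* the software stage: panic/log */
	}
	return (WD_NONE);
}
```

With a 30 second timeout and a 5 second pre-timeout, a tick at 25
seconds after the last pat reports the pre-timeout stage and a tick at
30 seconds reports the hard reset. If nothing ever calls wd_tick (the
software is deadlocked), only the hardware stage remains, which is the
limitation described above.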

Later revisions may include facilities for generating an NMI to trigger
a panic/logs, followed by a hard reset by external hardware.

Perhaps ichwd offers the ability to send an NMI?  Let me check the sources.

-Alfred

