Random reboots

Jordi Espasa Clofent jordi.espasa at opengea.org
Tue Dec 18 00:15:35 PST 2007

> Great.  I got your other message where you mention this just after I 
> sent mine.  Not trying to hound you :)


> The reason I ask is that I've run into a couple of issues where the 
> machine hangs.  If you were using a watchdog, that would cause the 
> system to reboot.  So as far as debugging goes, it's just as well that 
> you aren't using it.

Why would a watchdog cause a reboot?

> I have run into some issues with snapshots, are you using them?  You 
> might also check the SMART data on your disks since FreeBSD has some 
> bugs where failing drives are not handled gracefully.  See the 
> smartmontools port.

No snapshots here (we use Bacula for backups). Once ZFS is stable 
enough for production environments, maybe we'll use it.

We already use SMART on every server; in fact, HDD health is checked 
daily by a smartmontools shell script.
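
For illustration only, a daily health check of this kind might look 
roughly like the sketch below. It is not the script mentioned above; it 
assumes smartctl (from smartmontools) is on the PATH, and the function 
names are hypothetical. It only looks at the overall verdict printed by 
"smartctl -H".

```python
import subprocess


def health_passed(smartctl_output):
    """Return True if `smartctl -H` output reports a healthy drive."""
    for line in smartctl_output.splitlines():
        # ATA drives print "... self-assessment test result: PASSED".
        if "overall-health self-assessment test result" in line:
            return line.split(":", 1)[1].strip() == "PASSED"
        # SCSI drives print "SMART Health Status: OK".
        if line.startswith("SMART Health Status"):
            return line.split(":", 1)[1].strip() == "OK"
    # No verdict found at all: treat the drive as suspect.
    return False


def check_device(device):
    """Run `smartctl -H` on one device (hypothetical helper)."""
    result = subprocess.run(
        ["smartctl", "-H", device],
        capture_output=True,
        text=True,
    )
    return health_passed(result.stdout)
```

Calling check_device() from cron for each disk (for example "/dev/ad0", 
an assumed device name) and mailing a warning when it returns False 
would give a daily report. A passing verdict does not rule out a failing 
drive, so the attribute table is still worth reading by hand.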

> One other idea: you might configure a serial console so you can see any 
> messages the machine generates as it's dying.  (These wouldn't 
> necessarily appear in the log files, since the system is too dead to 
> write to them.) You could connect the serial port to another machine 
> which logs it.

That would be another thing to try.
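
On the logging machine, something along these lines could capture the 
console output. This is only a sketch: the function name and the clock 
parameter are hypothetical, and it assumes the serial port has already 
been set to the right speed by other means.

```python
import time


def log_console(source, logfile, clock=time.time):
    """Copy each line from `source` to `logfile` with a timestamp prefix.

    `source` yields bytes lines (e.g. a serial device opened in binary
    mode); `logfile` is a text file object. Returns the number of lines.
    """
    count = 0
    for raw in source:
        # Console output may contain stray bytes; replace rather than fail.
        text = raw.decode("ascii", errors="replace").rstrip("\r\n")
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(clock()))
        logfile.write(f"{stamp} {text}\n")
        # Flush after every line so the last messages before a crash
        # are already on disk.
        logfile.flush()
        count += 1
    return count
```

It would be called with the serial device opened as 
open(device, "rb", buffering=0) and a log file opened in append mode; 
the device name depends on the logging machine and is not given here.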

Jordi Espasa Clofent

