5.3 in diskless cluster: irregular reboots at 14:09 hr. ?!?!
spamrefuse at yahoo.com
Wed Dec 29 02:06:47 PST 2004
Colin J. Raven wrote:
> On Dec 29, Rob launched this into the bitstream:
>> I'm running 5.3-Stable on all PCs.
>> I have a master/router with 7 diskless slaves. One of the
>> slaves shows irregular reboots, without a trace, not even
>> a shutdown message in the logs.
>> So far, one particular slave has rebooted suddenly at the
>> following times:
>> Nov. 16 14:09:41
>> Nov. 30 14:09:23
>> Dec. 28 14:09:34
>> Each is exactly at the same time; this is rather peculiar, isn't it?
>> Any idea what's going on here, or how to trace this problem?
> What *else* is happening at (or immediately before) 14:09 on this
> machine? For example, is something rather intense occurring immediately
> beforehand? I'm thinking power supply failure when it gets loaded
> beyond a certain point...so, pursuant to that is there maybe a big log
> grep happening beforehand, or some other event that stresses components,
> thus consuming more power?
Thank you Colin.
What would be a good command to run to find out how heavily loaded the
PC is right before the reboot? Is 'top' good enough? Or is there
something better? 'ps auxw' for example?
Since I don't know on which date it will happen next, I will start
a cron job each day at 14:08 to check how heavily loaded the PC is. It
will write the result of the job to disk.
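A minimal sketch of such a job, assuming a plain /bin/sh script. The script name, the log path, and the default sampling values (12 snapshots, 10 seconds apart, so a run started at 14:08:00 covers the reboot times seen so far) are illustrative choices, not something taken from the thread. It records only 'uptime' and 'ps auxw' output:

```shell
#!/bin/sh
# loadsnap.sh (hypothetical name): append timestamped load snapshots to a log.
# Usage: loadsnap.sh LOGFILE [COUNT] [INTERVAL_SECONDS]

# Write one snapshot: timestamp, load averages, full process list.
snapshot() {
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        uptime
        ps auxw
    } >> "$1" 2>&1
    # Ask the kernel to flush buffered writes, in case the reboot follows.
    sync
}

# Take COUNT snapshots, INTERVAL seconds apart.
log_load() {
    logfile=$1
    count=${2:-12}      # assumed default: 12 snapshots
    interval=${3:-10}   # assumed default: 10 s apart, about two minutes
    i=0
    while [ "$i" -lt "$count" ]; do
        snapshot "$logfile"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Run only when a log file is given on the command line.
if [ "$#" -gt 0 ]; then
    log_load "$@"
fi
```

A matching crontab entry could look like this (both paths are placeholders):

    8 14 * * * /bin/sh /root/loadsnap.sh /var/log/loadsnap.log

On a diskless slave the log should live on a filesystem that survives the reboot, for example an NFS mount from the master, since a memory-backed /var would lose it.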
> It has that funny "I'll bet the PSU is on the way out" feeling to it,
> but actually proving that can be tedious.
I may also swap the UPS units between two slaves to see whether the
reboots are related to a shaky UPS. I don't want to replace the PSU yet :(.