>> These are very rare.... except they seem to happen about once a day for a
>> while and then stop... very strange..
>>> and usually caused by hardware problems (e.g. faulty power supply,
>>> overheating CPU, bad RAM).
>> Possible, but if so, the hardware fixed itself on the first two boxes I
>> mentioned.
>All of this can be bad, or not quite bad -- just not healthy -- 
>hardware.  Say a power supply that can't supply reliable +5, when the 
>line voltage drops a tad while all the disks are being hammered.  It 
>can be a nightmare to figure out.  Set up crash dumps, but also make 
>sure that the UPS the box is attached to isn't having problems.  If 
>it's not on conditioned power, fix that.

Also, a lot of older UPSes do not have any AVR (automatic voltage
regulation).  This, in conjunction with a marginal power supply, can
cause problems like the ones you describe.  One of our POPs is in an
area that has seen tremendous residential and industrial growth, putting
a strain on the local power.  Prior to some major upgrades by the local
utility company, we would see street power dropping below 100V during
peak usage, and our APCs that have "smart boost"
would all kick in to compensate.  Also, the UPS can just be "bad" over

As others have said, it's pretty rare for a reboot not to leave a crash
dump behind when it's a software issue. At the very least, enable crash
dumps on the machines in question. See the man page for dumpon. That
way you can at least narrow down the odds as to whether you are looking
at a hardware or a software issue.
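
For reference, here is a minimal sketch of what enabling dumps can look
like in /etc/rc.conf. This assumes a FreeBSD release recent enough to
accept dumpdev="AUTO" and a swap partition at least as large as the
dump you expect; neither was stated in this thread, so check dumpon(8)
and rc.conf(5) on your own boxes before relying on it.

```shell
# /etc/rc.conf fragment (a sketch; verify against rc.conf(5) on your release)

# "AUTO" asks the rc scripts to pick a swap device for kernel dumps.
# A specific device can be named here instead; see dumpon(8).
dumpdev="AUTO"

# Directory where savecore(8) writes the dump on the next boot.
# /var/crash is the usual default; it needs enough free space.
dumpdir="/var/crash"
```

If a box then reboots and nothing turns up in the dump directory, that
leans toward power or other hardware rather than a kernel panic, though
it is not proof either way.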


