Unstable machine; hardware or not?

Lowell Gilbert freebsd-questions-local at be-well.ilk.org
Mon Oct 30 19:10:02 UTC 2006


Ronald Paul <ronald at jesdesign.nl> writes:

> I have a small server (AMD XP 2400+, ASRock K7VM4+lan, no ECC) running
> 4.9-RELEASE since February 2004. It is being used for some small
> dynamic websites (FAMP), e-mail and some other small stuff. It reached an
> uptime of 400+ days last year, but over the past few months the machine
> has become more and more unstable.
>
> Seemingly random signals (most of them 11, some 10 and 6) are causing
> random processes (including bash, cron, named, adjkerntz, inetd,
> syslogd and sh) to exit. So this can hardly be anything other than
> faulty hardware, you would think. But, and this is the strange part
> for me, these instabilities seem to be triggered by something: after
> the machine is restarted, the server is rock-solid for the first week,
> and I can then compile a kernel without problems.
>
> Temperatures and voltages are fine:
>> # healthd -d
>> Temp.= 38.0, 21.5,  0.0; Rot.= 3629,    0,    0
>>  Vcore = 1.73, 0.00; Volt. = 3.28, 4.95, 11.55, -10.55, -4.56
>
> I already swapped the memory and disk, but the behavior stays the same.
> Is there any possibility that these crashes would disappear after
> switching to 6.1-RELEASE, or are these problems caused solely by
> hardware? If so, is there any indication of which hardware component I
> should look at? I'm planning to switch motherboards, but since it is
> quite a drive to our co-location facility, the machine is still
> functioning as a production server, and we do not have much in the way
> of failsafe services yet, I want to think twice.

Yes, it's probably a hardware problem, and yes, it will probably be
hard to prove that.  Assuming your time has some value, I would
recommend replacing the whole machine; that way, you can have it set
up and tested before moving it out on location.
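
If some evidence is wanted before committing to the swap, one option is to
tally the kernel's "exited on signal" messages and see which signals and
processes dominate. The following is only a minimal sketch in Python, and
it assumes the messages appear in the usual form
"pid N (name), uid N: exited on signal N" and are written to
/var/log/messages. The function name tally_signal_exits and the log path
are illustrative, not something taken from this thread.

```python
import re
from collections import Counter

# Assumed kernel log format: "pid 123 (named), uid 0: exited on signal 11"
SIGNAL_EXIT = re.compile(
    r"pid (\d+) \(([^)]+)\), uid (\d+): exited on signal (\d+)"
)


def tally_signal_exits(lines):
    """Count signal exits per signal number and per process name."""
    by_signal = Counter()
    by_process = Counter()
    for line in lines:
        match = SIGNAL_EXIT.search(line)
        if match is None:
            continue  # not a signal-exit message
        by_process[match.group(2)] += 1
        by_signal[int(match.group(4))] += 1
    return by_signal, by_process


if __name__ == "__main__":
    # Hypothetical path; adjust to wherever syslog writes kernel messages.
    with open("/var/log/messages", errors="replace") as log:
        by_signal, by_process = tally_signal_exits(log)
    for sig, count in by_signal.most_common():
        print(f"signal {sig}: {count}")
    for name, count in by_process.most_common():
        print(f"{name}: {count}")
```

A spread of signals 11 (SIGSEGV), 10 (SIGBUS) and 6 (SIGABRT) across many
unrelated processes would be consistent with failing hardware, though it
would not prove it.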

