problem with "cold" hardware? [Was: panic in callout_reset: bad link in callwheel]

Andriy Gapon avg at icyb.net.ua
Wed Jan 28 04:42:34 PST 2009


on 24/01/2009 13:00 Andriy Gapon said the following:
[snip]
> Additional info:
> I recently added some new memory to this system.
> The memory survived several passes of memtest86 before booting to
> FreeBSD. It also survived one pass after the incident.
> Still I wouldn't exclude a possibility of it being bad.

I think I have established that the crash was caused by a hardware issue.
I had another panic in a different place but with similar
diagnostics: a bad pointer passed to a call. Fortunately, the second time
the pointer was to a well-known long-lived object, so I was able to
compare the bad pointer to the actual address. It turned out that a
single bit was flipped.
Then I realized that in both cases I saw the panics after "very cold" boots,
i.e. the system had been powered down for more than 1 hour before the boot.
So I ran memtest86 again, this time also after a long
power-off, and it reported lots of errors.
I restarted memtest86 10 minutes later and then it could not find any
errors in any of the tests.
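
For illustration, the pointer comparison described above can be sketched in
C: XOR the bad pointer with the known-good address and check whether exactly
one bit differs. This is only a minimal sketch; the function names are
hypothetical helpers, and no addresses from the actual panics are used.

```c
#include <stdint.h>

/* Hypothetical helper: count the bits that differ between two addresses. */
static int
differing_bits(uint64_t good, uint64_t bad)
{
    uint64_t diff = good ^ bad;   /* set bits mark positions that differ */
    int count = 0;

    while (diff != 0) {
        diff &= diff - 1;         /* clear the lowest set bit */
        count++;
    }
    return (count);
}

/*
 * Hypothetical helper: return the index (0 = least significant) of the
 * flipped bit if exactly one bit differs, or -1 otherwise.
 */
static int
single_flipped_bit(uint64_t good, uint64_t bad)
{
    uint64_t diff = good ^ bad;
    int idx = 0;

    /* Zero means identical; more than one set bit is not a single flip. */
    if (diff == 0 || (diff & (diff - 1)) != 0)
        return (-1);
    while ((diff & 1) == 0) {
        diff >>= 1;
        idx++;
    }
    return (idx);
}
```

With a made-up address `good`, calling `single_flipped_bit(good, good ^ (1ULL << 21))`
returns 21, while two values that differ in several bits give -1, which would
point away from a simple single-bit memory error.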

I have heard of problems with hardware running hot before, but not
with it being "cold". I put the word in quotes because the system is in
a room at normal room temperature.

Any guesses as to which hardware part might be acting up like this?


-- 
Andriy Gapon


More information about the freebsd-hardware mailing list