[Solved] Re: kernel trap 12 with interrupts disabled [bge0 on 7.2R]

Martin nakal at web.de
Sat May 30 18:31:06 UTC 2009


On Fri, 15 May 2009 12:05:47 -0400,
John Baldwin <jhb at freebsd.org> wrote:

> On Friday 15 May 2009 11:38:00 am Martin wrote:
> > On Fri, 15 May 2009 11:09:20 -0400,
> > John Baldwin <jhb at freebsd.org> wrote:
> > 
> > > x/i please.  The /i decodes it as an instruction so I can see
> > > which registers it was attempting to dereference.
> > 
> > Oh sorry...
> > 
> > (kgdb) x/i 0xffffffff805bbc66
> > 0xffffffff805bbc66 <rt_maskedcopy+6>:	movzbl (%rdx),%edx
> 
> Hmm, your %rdx is garbage. :(
> 
> rdx            0xef3fdf377db53afa       -1207000745686779142
> 
> That should at least be
> 
>                0xffffff..........
> 
> Looks like r9 and r14 have the same odd value.  Normally I would see
> a more obvious breakage such as one of the 'f' nibbles being set to
> '0' or 'e', etc. You could try looking for that odd pointer value in
> the route structure or as arguments to other functions in the stack
> trace to see if you can find a corrupted data structure.

Hi John,

I want to thank you once again. You were right that the hardware was
broken. I contacted hardware support, and after replacing the memory
and the mainboard (neither of which fixed the problem), I finally
found out that the CPU was broken.

I think this is why we didn't see obvious memory failures, such as
single flipped bits and broken patterns, but TOTALLY different
memory contents.
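
To illustrate the kind of check John describes, here is a minimal
sketch that compares a register value against the "0xffffff.........."
pattern from his mail. The mask (top 24 bits set) is read off that
pattern, and the function names are made up for this example; this is
not a kgdb feature, only a rough way to see how far a value is from a
plausible kernel pointer.

```python
# Top 24 bits set, taken from the "0xffffff.........." pattern quoted above.
# This is an assumption based on that mail, not a general rule.
KERNEL_PREFIX_MASK = 0xFFFFFF0000000000


def wrong_prefix_bits(addr):
    """Count bits in the top 24 that are 0 where the pattern expects 1."""
    diff = (addr & KERNEL_PREFIX_MASK) ^ KERNEL_PREFIX_MASK
    return bin(diff).count("1")


def damaged_nibbles(addr):
    """Return positions (0 = most significant) of the top six hex digits
    that are not 'f'."""
    digits = f"{addr:016x}"[:6]
    return [i for i, d in enumerate(digits) if d != "f"]


if __name__ == "__main__":
    rdx = 0xEF3FDF377DB53AFA  # %rdx value from the register dump
    print("wrong bits in prefix:", wrong_prefix_bits(rdx))
    print("damaged nibble positions:", damaged_nibbles(rdx))
```

For the %rdx value above, this reports 4 wrong bits spread over three
of the six leading nibbles, which matches John's remark that it does
not look like a single 'f' nibble being changed.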

What I have learned from this:

- FreeBSD 7.2R hasn't let me down :)
- memtest (sometimes called memtester) is a good utility when you want
  to test memory AND put the machine under high load to heat up the
  components a bit. This is possible because it runs within the OS.
  memtest86+, first, is broken on FreeBSD/amd64, and second, it did not
  find anything in my case (tested on Linux), because it does not
  generate as much load as 3 parallel memtest processes.
- One thing I noticed about memtest is that it cannot mlock (lock in
  memory) a big chunk of memory more than once (within a single
  process), and mlock above 2GB is a problem. It might be interesting
  to look into this.
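
As a rough illustration of the idea behind running several memory
testers in parallel from within the OS, here is a toy sketch. It is
not how memtest/memtester is implemented, it does not lock memory, and
the function names, buffer size and worker count are all chosen for
this example only. Each worker fills a buffer with a few byte
patterns, reads it back, and counts mismatches.

```python
import multiprocessing


def pattern_errors(size_bytes, patterns=(0x00, 0xFF, 0xAA, 0x55)):
    """Write each byte pattern over a buffer, read it back, and return
    the total number of bytes that did not hold the pattern."""
    buf = bytearray(size_bytes)
    errors = 0
    for pattern in patterns:
        buf[:] = bytes([pattern]) * size_bytes    # write pass
        errors += size_bytes - buf.count(pattern)  # read-back pass
    return errors


def run_parallel(workers=3, size_bytes=1 << 20):
    """Run the same check in several processes at once to raise the load.
    Returns one error count per worker."""
    with multiprocessing.Pool(workers) as pool:
        return pool.map(pattern_errors, [size_bytes] * workers)


if __name__ == "__main__":
    print("errors per worker:", run_parallel())
```

On healthy hardware every worker should report zero errors. A real
tester uses far larger buffers, many more patterns, and locked memory,
so this sketch is only meant to show the write/verify loop and the
parallel-process setup.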

Thanks John, you have been a great help to me. Our web server is
running stably again.

--
Martin


More information about the freebsd-stable mailing list