Recent problems with 6-STABLE...

John Baldwin jhb at freebsd.org
Fri Feb 1 14:29:22 PST 2008


On Thursday 31 January 2008 11:12:40 pm gnn at freebsd.org wrote:
> At Thu, 31 Jan 2008 06:17:16 -0500,
> John Baldwin wrote:
> > 
> > On Thursday 31 January 2008 04:37:13 am gnn at freebsd.org wrote:
> > > At Tue, 29 Jan 2008 11:57:39 -0500,
> > > John Baldwin wrote:
> > > > 
> > > > On Tuesday 29 January 2008 07:32:16 am gnn at freebsd.org wrote:
> > > > > Hi,
> > > > > 
> > > > > I have two boxes running 6-STABLE, post 6.3 release, which have both
> > > > > spontaneously rebooted, one under load and one not under load.  I have
> > > > > attached dmesg and some traceback information, from the one trace that
> > > > > looked interesting.  Any thoughts or hints would be appreciated.
> > > > > 
> > > > > To save you scanning all the dmesg first these are dual processor XEON
> > > > > boxes, each processor has 4 cores.
> > > > 
> > > > Can you do 'x/i 0xffffffff80296642' to show which instruction faulted?
> > > 
> > > (kgdb) x/i 0xffffffff80296642
> > > 0xffffffff80296642 <pfs_exit+114>:      cmp    %ecx,0x8(%rdx)
> > 
> > Hmm, and rdx from your last post was:
> > 
> > > printf "%x\n" 32491047111385957
> > 736e6f69746365
> > 
> > > echo "0x73 0x6e 0x6f 0x69 0x74 0x63 0x65" | dh
> > snoitce
> > 
> > so it appears you have a data corruption issue.  You could check the
> > hardware (RAM, etc.) but if that is ok you might want to see if you
> > can isolate it to a specific driver if a driver has a bug (or
> > hardware has an errata we don't work around yet).  Do you have any
> > custom drivers for hardware that does DMA?  If not, which storage
> > driver (including pciconf output if ATA) and NIC(s) does this box
> > have?  Also, how much RAM?
> 
> Custom drivers?  Not that I know of.  This box uses Intel Pro/1000
> network drivers and Adaptec AIC7902 SCSI for talking to the disks.
> 
> The box has 8G of RAM in 2G chunks (which has now been subjected to 40
> memtests and passed).

Try hw.physmem=4g at the loader to see if it fixes it.  If so, it's a bug with
bounce buffering.
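
For anyone who wants to repeat the check on the rdx value quoted above, here
is a minimal Python sketch of the same decoding.  The helper name
register_to_ascii is made up for illustration and is not part of kgdb or any
FreeBSD tool.  It assumes a 64-bit register and reads the bytes in
little-endian order, which is the order they would sit in memory on amd64.

```python
def register_to_ascii(value: int, width: int = 8) -> str:
    """Render an integer register value as ASCII, in little-endian byte order.

    Hypothetical helper for illustration only. Non-printable bytes
    (such as NUL) are shown as '.'.
    """
    raw = value.to_bytes(width, byteorder="little")
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


# Decimal value of rdx as quoted earlier in the thread.
rdx = 32491047111385957

print(hex(rdx))                # same hex digits as the printf "%x" above
print(register_to_ascii(rdx))  # the bytes in memory order
```

The printf output reads "snoitce" because it lists the most significant byte
first.  In memory order the same bytes spell "ections" followed by a NUL, so
the register held ASCII text where a pointer was expected, which fits the
data corruption theory.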

-- 
John Baldwin


More information about the freebsd-amd64 mailing list