Xserve G4 stability (random processes crashing)

Kevin Day toasty at dragondata.com
Sun Apr 4 01:15:14 UTC 2010


>> If anything, it seems worse on -RELEASE than -STABLE.  In -STABLE I was at least able to get through a buildworld with only restarting it once, and now in -RELEASE I've restarted about 10 times and still haven't made it all the way through.
>> 
>> Same symptoms as before, gcc giving internal compiler errors, segfaults, or corrupt .o files being produced.  Memtester (even running in parallel with buildworld) never reports any errors. I'll keep fiddling with this, but if anyone has any suggestions on where to look for some clues, it'd be appreciated.
>>  
> Since you say UP kernels have the same problems, other G4 machines seem not to have issues, and SMP G5 Xserves are completely stable, that points at some G4 Xserve-specific piece of hardware. I'd guess the ATA controller. Could you try chroot to an NFS volume mounted from a known-stable machine, or a USB or Firewire disk, and trying the same things?
> -Nathan


I think you may be on to something... trying to copy /usr/src over to an NFS mount, I got:

ad0: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad0: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad0: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad0: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad0: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=25113024

These messages repeated slowly on the console, with the LBA changing each time.
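
For anyone wanting to tally messages like these (for example, to see which commands time out most often or whether the failing LBAs cluster), here is a rough sketch in Python. It was not run as part of this thread. It assumes the console lines have exactly the two shapes shown above; the function name `summarize` and both regular expressions are illustrative, and other ata message formats are not handled.

```python
import re
from collections import Counter

# Matches lines like:
#   ad0: WARNING - SET_MULTI taskqueue timeout - completing request directly
WARNING_RE = re.compile(r"^(ad\d+): WARNING - (.+?) taskqueue timeout")

# Matches lines like:
#   ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=25113024
TIMEOUT_RE = re.compile(
    r"^(ad\d+): TIMEOUT - (\S+) retrying \((\d+) retr(?:y|ies) left\) LBA=(\d+)"
)


def summarize(lines):
    """Count taskqueue-timeout warnings per (device, command) and
    collect (device, command, LBA) for each retried request."""
    warnings = Counter()
    timeouts = []
    for line in lines:
        line = line.strip()
        m = WARNING_RE.match(line)
        if m:
            warnings[(m.group(1), m.group(2))] += 1
            continue
        m = TIMEOUT_RE.match(line)
        if m:
            timeouts.append((m.group(1), m.group(2), int(m.group(4))))
    return warnings, timeouts


# The lines quoted above, used as sample input.
SAMPLE = [
    "ad0: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly",
    "ad0: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly",
    "ad0: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly",
    "ad0: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly",
    "ad0: WARNING - SET_MULTI taskqueue timeout - completing request directly",
    "ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=25113024",
]

warnings, timeouts = summarize(SAMPLE)
for (dev, cmd), count in sorted(warnings.items()):
    print(f"{dev} {cmd}: {count} warning(s)")
for dev, cmd, lba in timeouts:
    print(f"{dev} {cmd} timed out at LBA {lba}")
```

Feeding it a saved copy of the console output instead of `SAMPLE` would give the same kind of summary over a longer run.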

I'm going to do some more fiddling, but it does look ATA-related now.




More information about the freebsd-ppc mailing list