mps0-troubles

Kenneth D. Merry ken at freebsd.org
Tue Feb 8 20:13:12 UTC 2011


On Tue, Feb 08, 2011 at 02:35:35 +0100, Joachim Tingvold wrote:
> On Fri, Feb 04, 2011, at 19:00:11PM GMT+01:00, Kenneth D. Merry wrote:
> >Perhaps it could depend on memory fragmentation somewhat.  Over time you
> >may see the low water mark go down a bit.
> >
> >The good news is that it doesn't look like we have a leak.
> 
> <http://home.komsys.org/~jocke/dmesg_mps0_freebsd-scsi_5.txt>
> 

This particular error is interesting:

mps0: (0:40:0) terminated ioc 804b scsi 0 state c xfer 0
mps0: (0:40:0) terminated ioc 804b scsi 0 state c xfer 0

It means that the chip terminated the command for some reason.  I have been
talking to LSI about it.  I'm working on getting an analyzer trace when it
happens, so I can send that to LSI.
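
For anyone who wants to pick these lines out of a dmesg, here is a rough
Python sketch that splits the message into fields and masks the IOC status
word.  The constant values are assumptions recalled from the MPI 2.0 headers
(check them against mpi2.h before relying on them), as is the reading of
(0:40:0) as bus:target:lun and of the xfer field as decimal.  The decode()
helper and the regular expression are purely illustrative and are not part
of the driver.

```python
import re

# Assumed values, recalled from the MPI 2.0 headers; verify against mpi2.h.
IOCSTATUS_FLAG_LOG_INFO_AVAILABLE = 0x8000
IOCSTATUS_MASK = 0x7FFF
IOCSTATUS_SCSI_IOC_TERMINATED = 0x004B

# Matches the message format seen in the dmesg above (hex fields assumed).
LINE_RE = re.compile(
    r"mps(?P<unit>\d+): \((?P<bus>\d+):(?P<target>\d+):(?P<lun>\d+)\) "
    r"terminated ioc (?P<ioc>[0-9a-f]+) scsi (?P<scsi>[0-9a-f]+) "
    r"state (?P<state>[0-9a-f]+) xfer (?P<xfer>\d+)"
)


def decode(line):
    """Return a dict of decoded fields, or None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    ioc = int(m.group("ioc"), 16)
    status = ioc & IOCSTATUS_MASK
    return {
        "unit": int(m.group("unit")),
        "target": int(m.group("target")),
        "lun": int(m.group("lun")),
        "ioc_status": status,
        "log_info_available": bool(ioc & IOCSTATUS_FLAG_LOG_INFO_AVAILABLE),
        "ioc_terminated": status == IOCSTATUS_SCSI_IOC_TERMINATED,
        "scsi_status": int(m.group("scsi"), 16),
        "scsi_state": int(m.group("state"), 16),
        "xfer": int(m.group("xfer")),
    }


if __name__ == "__main__":
    print(decode("mps0: (0:40:0) terminated ioc 804b scsi 0 state c xfer 0"))
```

If those constants are right, 804b is the "IOC terminated" status with the
log-info-available flag set, which would fit the chip terminating the
command on its own.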

What kind of expander do you have in your system?  How many expanders do
you have?  How many drives do you have?  Can you send 'camcontrol devlist
-v' output?

> [jocke at filserver ~]$ sysctl hw.mps.0
> hw.mps.0.debug_level: 0
> hw.mps.0.allow_multiple_tm_cmds: 0
> hw.mps.0.io_cmds_active: 1
> hw.mps.0.io_cmds_highwater: 959
> hw.mps.0.chain_free: 2048
> hw.mps.0.chain_free_lowwater: 1721
> hw.mps.0.chain_alloc_fail: 0
> 
> This time I did a recursive copy of a folder with no large files at  
> all (it contained only small documents), from 'storage' to 'storage'.
> 
> However, it recovered, so the copy just continued where it left off --  
> which is a change from previous crashes.

Yes, it looks like we're not running into the out of chain problem.
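
The reasoning from the sysctl numbers can be sketched in a few lines of
Python.  This is only an illustration: parse_sysctl() and chain_summary()
are hypothetical helpers, and the pool size of 2048 is an assumption taken
from the chain_free value in the quoted output.

```python
SAMPLE = """\
> hw.mps.0.debug_level: 0
> hw.mps.0.allow_multiple_tm_cmds: 0
> hw.mps.0.io_cmds_active: 1
> hw.mps.0.io_cmds_highwater: 959
> hw.mps.0.chain_free: 2048
> hw.mps.0.chain_free_lowwater: 1721
> hw.mps.0.chain_alloc_fail: 0
"""


def parse_sysctl(text, prefix="hw.mps.0."):
    """Turn 'name: value' lines (optionally email-quoted) into a dict of ints."""
    values = {}
    for line in text.splitlines():
        line = line.lstrip("> ").strip()
        if not line.startswith(prefix) or ":" not in line:
            continue
        name, _, value = line.partition(":")
        values[name[len(prefix):]] = int(value)
    return values


def chain_summary(values, pool_size=2048):
    """Summarize chain frame usage; pool_size is an assumed total."""
    peak_used = pool_size - values["chain_free_lowwater"]
    failures = values["chain_alloc_fail"]
    return {
        "peak_used": peak_used,
        "alloc_failures": failures,
        # Treat the pool as exhausted only if an allocation actually failed
        # or the low water mark reached zero.
        "exhausted": failures > 0 or values["chain_free_lowwater"] == 0,
    }


if __name__ == "__main__":
    print(chain_summary(parse_sysctl(SAMPLE)))
```

With the numbers above, the worst case used 327 of the 2048 chain frames
and no allocation ever failed.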

The timeouts could be due to all sorts of problems.  The IOC terminated
errors I'm still not sure about.  I need to get a trace and send that along
with a diagnostic ring buffer dump from the card to LSI to get some answers
about what is going on.

Ken
-- 
Kenneth Merry
ken at FreeBSD.ORG

