AOC-USAS2-L8i zfs panics and SCSI errors in messages

Douglas Gilbert dgilbert at interlog.com
Mon Nov 7 15:36:27 UTC 2011


On 11-11-07 03:56 AM, Rich wrote:
> Observation - the LSI SAS expanders, in my experience, sometimes
> misbehave when a drive takes longer than some timeout to respond to
> commands (as far as I've seen this only happens with SATA drives, but
> I don't have many SAS drives for comparison). All further commands to
> that drive then fail for a while, and what happens next varies
> dramatically depending on the OS.
>
> If you could try without an expander (e.g. with 1->4 SAS->SATA fanout
> cables), you may be surprised (and/or annoyed) to find your life gets
> better.

SAS-2 expanders are better than the original generation.
[LSI makes both.] SAS-2 added the CONFIGURE GENERAL SMP
function, which contains various timeout settings for
STP (the SATA Tunneling Protocol, i.e. the protocol that
tunnels (S)ATA commands between a SAS HBA (initiator) and
an expander).

If you are using SAS-2 expanders and FreeBSD 9.0 then you
can fetch my smp_utils package and use the smp_conf_general
utility to change those timeout settings. If you have SAS-2
expanders but an older version of FreeBSD then you will
need Solaris or Linux to run my smp_utils package in order
to change those timeout values on the expander.

Doug Gilbert

BTW smp_rep_general will show the current settings of those
STP timeouts.
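
To make the two steps above concrete (read the current STP timeouts,
then change them), here is a rough Python sketch that only builds the
command lines and does not execute anything. The option names
(--connect, --inactivity, --nexus) are given as I understand the
smp_conf_general man page, so check "smp_conf_general --help" first.
The unit sizes follow the SAS-2 CONFIGURE GENERAL field definitions.
The device name /dev/ses0 and the millisecond values are hypothetical
placeholders, not recommendations.

```python
import shlex

# Size of one unit, in microseconds, for each 16-bit timeout field
# (assumed option name -> unit size).
UNIT_US = {
    "inactivity": 100,  # STP bus inactivity time limit, 100 us units
    "connect": 100,     # STP maximum connect time limit, 100 us units
    "nexus": 1000,      # STP SMP I_T nexus loss time, 1 ms units
}


def rep_general_argv(device):
    """Command line that reports the expander's current settings."""
    return ["smp_rep_general", device]


def conf_general_argv(device, **timeouts_ms):
    """Command line that sets the given timeouts (in milliseconds)."""
    argv = ["smp_conf_general"]
    for name in sorted(timeouts_ms):
        # Convert milliseconds into the field's own units.
        value = round(timeouts_ms[name] * 1000 / UNIT_US[name])
        if not 0 <= value <= 0xFFFF:
            raise ValueError(f"{name}: {value} does not fit in 16 bits")
        argv.append(f"--{name}={value}")
    argv.append(device)
    return argv


if __name__ == "__main__":
    dev = "/dev/ses0"  # hypothetical device node for the expander
    print(shlex.join(rep_general_argv(dev)))
    print(shlex.join(conf_general_argv(dev, inactivity=50, connect=100)))
```

Running smp_rep_general before and after the change is a cheap way to
confirm that the expander actually accepted the new values.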

> On Mon, Nov 7, 2011 at 3:48 AM, Karli Sjöberg<Karli.Sjoberg at slu.se>  wrote:
>> As a test, I have copied in about 1.5TB and scrubbed several times
>> without any panic. It stayed solid until periodic weekly ran :( Same
>> panic as with daily.
>>
>> /Karli Sjöberg
>>
>> On 26 Oct 2011 at 12:16, Jeremy Chadwick wrote:
>>
>> On Wed, Oct 26, 2011 at 11:36:44AM +0200, Karli Sjöberg wrote:
>> Hi all,
>>
>> I tracked down what causes the panics!
>>
>> I got a tip from aragon and phoenix at the forum about
>> /etc/periodic/security/100.chksetuid
>>
>> And to put:
>> daily_status_security_chksetuid_enable="NO"
>> into /etc/periodic.conf
>>
>> This is not truly the cause of the panic, it simply exacerbates it.
>>
>> Many of the periodic scripts will do things like iterate over all files
>> on the filesystem looking for specific attributes, etc.  This tends to
>> stress filesystems heavily.  This isn't the only one.  :-)
>>
>> I can now run periodic daily without any panics. I'm still wondering
>> about the cause of this, the explanation from the forum was that that
>> phase is too demanding for multi TB systems. But I have several multi
>> TB servers with FreeBSD and ZFS, and none of them has ever behaved
>> this way. Besides, the panic is instantaneous, not degenerative. I
>> imagine that a run like that would start out OK and then just get
>> worse and worse, getting gradually slower and slower until it just
>> wouldn't cope any more and hang. This feels more like hitting a wall.
>> As if it found something that it couldn't deal with and has no choice
>> but to panic immediately.
>>
>> It may be possible that you have some underlying filesystem corruption
>> that triggers this situation.  Have you actually tried doing a "zpool
>> scrub" of your pools and seeing if any errors happen or if the panic
>> occurs there?
>>
>> I'm inclined to think what you're experiencing is probably a bug or
>> "quirk" in the storage controller driver you're using.  There are other
>> drivers that have had fixes applied to them "to make them work decently
>> with ZFS", meaning the kind of stressful I/O ZFS puts on them results in
>> the controller driver behaving oddly or freaking out, case in point.  It
>> could also be a controller firmware bug/quirk/design issue.  Seriously.
>>
>> I believe the AOC-USAS2-L8i controller has been discussed on
>> freebsd-stable, re: mps(4) driver problems or equivalent, but I'm not
>> going to CC that list given that there would be 3 cross-posted lists
>> involved and that is liable to upset some folks.  You should search the
>> mailing lists for discussion of Supermicro controllers that work
>> reliably with FreeBSD.
>>
>> It would be worthwhile to discuss this condition on -stable, mainly with
>> something like "Anyone else using the AOC-USAS2-L8i reliably with ZFS?"
>> You get the idea.
>>
>> --
>> | Jeremy Chadwick                                jdc at parodius.com |
>> | Parodius Networking                       http://www.parodius.com/ |
>> | UNIX Systems Administrator                   Mountain View, CA, US |
>> | Making life hard for others since 1977.               PGP 4BD6C0CB |
>>
>>
>>
>>
>> Best regards
>> -------------------------------------------------------------------------------
>> Karli Sjöberg
>> Swedish University of Agricultural Sciences
>> Box 7079 (Visiting Address Kronåsvägen 8)
>> S-750 07 Uppsala, Sweden
>> Phone:  +46-(0)18-67 15 66
>> karli.sjoberg at slu.se
>>