Cannot replace broken hard drive with LSI HBA

Rich rincebrain at gmail.com
Mon Sep 28 14:27:09 UTC 2015


Hi Karli,
Which mps-supported HBA? Your firmware version indicates it's
something in the 92xx family, but there are a number of variants on that
flavor.

Have you played with any of the drive timeout settings in the HBA
firmware/OS/drives themselves (the dark vendor-specific magic known
variously as TLER, CCTL, ERC...)?
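
As one concrete example of those knobs: on SATA drives that support SCT
Error Recovery Control, smartmontools can query or set the timeout with
`smartctl -l scterc`, where the values are given in tenths of a second.
Below is a minimal Python sketch that only builds the command line (it
does not run anything). The helper name, the /dev/da0 device and the
7-second figure are illustrative assumptions, not a recommendation for
this hardware, and not every drive honors or persists the setting.

```python
def scterc_command(device, read_seconds=None, write_seconds=None):
    """Build a smartctl argument list for SCT Error Recovery Control.

    With no timeouts given, the command only queries the current setting.
    smartctl expects the timeouts in tenths of a second (deciseconds).
    """
    if read_seconds is None and write_seconds is None:
        return ["smartctl", "-l", "scterc", device]
    if read_seconds is None or write_seconds is None:
        raise ValueError("give both read and write timeouts, or neither")
    # Convert seconds to the deciseconds smartctl expects.
    read_ds = round(read_seconds * 10)
    write_ds = round(write_seconds * 10)
    return ["smartctl", "-l", f"scterc,{read_ds},{write_ds}", device]


if __name__ == "__main__":
    # /dev/da0 is a placeholder device name (assumption).
    print(" ".join(scterc_command("/dev/da0")))
    print(" ".join(scterc_command("/dev/da0", 7.0, 7.0)))
```

Building the argument list separately from running it makes it easy to
review what would be sent to each drive before touching a production pool.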

What models are the servers?

There are a number of possible complicating factors here: whether the
drives are SAS or SATA (and any "quirks" of the drives), whether the
backplanes are passive or have SAS expanders, what version of SAS/SATA
these backplanes are capable of handling, and any firmware strangeness on
the backplane, passive or otherwise.

How does the machine misbehave once you re-insert the drive?

Does the machine misbehave if you keep the drive removed?

One final quirk I'll mention is that a number of SAS expander
backplanes I've encountered will sometimes not notice that a drive has
been physically pulled until a new drive is inserted. Sometimes the
best way to convince one to see a drive after pulling one that was
misbehaving is:
- seat a "new" (not otherwise in the machine) drive
- unseat said drive after a few seconds
- seat whatever drive you intended to seat in the first place, be it
"new" or the original drive

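To check whether a reseat like the one above actually registered, one
option is to compare `camcontrol devlist` output before and after
(possibly following a `camcontrol rescan all`). Here is a rough Python
sketch, assuming the usual devlist line shape that ends in a
parenthesised list of device names. The function names and the sample
lines are made up for illustration.

```python
import re

# Matches the trailing "(pass0,da0)" style group on a devlist line.
_DEVS = re.compile(r"\(([^()]*)\)\s*$")


def device_names(devlist_output):
    """Return the set of peripheral names (da0, pass0, ...) in devlist text."""
    names = set()
    for line in devlist_output.splitlines():
        match = _DEVS.search(line)
        if match:
            names.update(n.strip() for n in match.group(1).split(",") if n.strip())
    return names


def diff_devices(before, after):
    """Return (appeared, disappeared) device names between two snapshots."""
    old, new = device_names(before), device_names(after)
    return sorted(new - old), sorted(old - new)


if __name__ == "__main__":
    # Hypothetical snapshots; real text would come from `camcontrol devlist`.
    before = "<ATA EXAMPLE-DISK 1.0>  at scbus0 target 0 lun 0 (pass0,da0)\n"
    after = before + "<ATA EXAMPLE-DISK 1.0>  at scbus0 target 1 lun 0 (pass1,da1)\n"
    print(diff_devices(before, after))
```

If both lists come back empty after seating a drive, the HBA or backplane
never reported the change, which would match the "blinks, then nothing"
symptom.
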
Good luck,

- Rich

On Mon, Sep 28, 2015 at 9:36 AM, Karli Sjöberg <karli.sjoberg at slu.se> wrote:
> Hey all!
>
> I'm just giving a shout out here to see if anyone else has had similar
> experiences working with LSI/Avago HBAs in FreeBSD.
>
> For some time now, about a year or so, we've had several cases where a
> hard drive has dropped out; you pull it out, pop a new one back in, but it
> never shows up in the OS. When inserted, nothing prints in the logs, and
> physically, it just blinks for half a second, then nothing. The entire
> server then needs to be rebooted to get the drive back.
>
> As for the hardware, we have several SuperMicro servers, an HP, and an
> old SUN server that all have this problem. It's happened with both old
> and new drives from different manufacturers and of different sizes. The
> only thing in common has been the LSI/Avago HBA.
>
> The software is FreeBSD-10.1-STABLE as per this[*] bug, very close to
> 10.2-RELEASE, mps driver version 20 and the firmware has been flashed to
> 19. Also tried firmware version 20 but ZFS went nuts, displaying
> checksum errors on just about every disk in the pool.
>
> It's gotten to the point where I'm fed up and have to ask if someone else
> could think of a fix, since neither software nor firmware upgrades seem
> to make a difference. Or to suggest another HBA instead?
>
> Thanks in advance!
>
> /K
>
> [*]: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=191348
>
> _______________________________________________
> freebsd-fs at freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe at freebsd.org"

