Failing ZFS log devices/panic

Grant Gray grant at gray.id.au
Fri Aug 31 03:04:18 UTC 2018



----- On 31 Aug, 2018, at 12:58 PM, M. Casper Lewis mclewis at genomecenter.ucdavis.edu wrote:

> On Fri, Aug 31, 2018 at 11:35:12AM +1000, Grant Gray wrote:
>> I'm going to defer to people with more experience with this hardware
>> combination than myself, but I will say I've had compatibility issues with
>> SATA SSD's on LSI SAS controllers in the past. In my case this manifested
>> as the SSD's disappearing off the bus and not returning.
> 
> And after a reboot they are back?

Yep, they would work for 2-3 days at a time before falling off the bus. Can you post any specific I/O errors from your logs?

> 
> Interesting.  This is not what we are seeing, rather an accumulation of
> errors until the device is faulted, despite both SMART and the vendor
> utility reporting a healthy drive.
> 
>> I've currently got an issue between some HGST SATA disks and a SAS3008 HBA
>> where mixing SAS and SATA disks on the same port results in intermittent
>> I/O errors on the HGST SATA disks, but not other SAS devices.
> 
> This sounds like what we are seeing, but we're not mixing SAS and SATA.
> 
>> If feasible, it may be useful to move the SSD's onto a plain SATA controller
>> as a diagnostic step.
> 
> Certainly a step to consider if our controller swap does not improve the
> situation.  That said, this is a production fileserver with about half a
> petabyte of Important Data(tm) on it, so experimentation is not exactly
> the soup of the day.  Which is to say we'd like to take this machine down
> as infrequently as possible.  Diagnostic steps sans downtime are
> preferable.

In that case, it sounds like you need a similarly configured machine that you can use to reproduce and resolve the issue in parallel with production workloads.

One further consideration is whether these SSDs are suited to write-intensive workloads: it isn't too difficult to overwhelm an SSD with synchronous writes, potentially leading to timeouts. Have you experimented with any write-biased SSDs?

> 
> --
> M. Casper Lewis                     |   mclewis at ucdavis.edu
> Systems Administrator               |   Voice: (530) 754-7978
> Genome Center                       |
> University of California, Davis     |

