zfs pool ssd cache drive dropping off

cruxpot cruxpot at gmail.com
Wed Oct 14 20:34:35 UTC 2015


When I first put it in, it was working fine, but I logged in one day and
found it in the state shown below. That time it dropped off after 7 days.
After a recent reboot, it dropped off an hour later according to the
timestamps. Are there any diagnostic commands on FreeBSD to help me
determine whether the SSD is failing? I'm wondering if ZFS killed it. I
simply added it to a raid-z pool with the "zpool add <pool> cache /dev/ada4"
command. I can try a different SATA cable and port, but there is probably
only a small chance that is the problem, because this seems to be an
intermittent issue.
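
For reference, here is a minimal sketch of the kind of read-only checks the
question above is asking about. It assumes the device is still ada4 (taken
from the zpool status output quoted below) and that smartctl has been
installed from the sysutils/smartmontools port; the run() helper is only an
illustrative wrapper, not part of FreeBSD. It should be run as root, and it
is a starting point rather than a definitive test procedure.

```shell
#!/bin/sh
# Device name is an assumption based on the quoted zpool status output.
DEV="${1:-ada4}"

run() {
    # Illustrative helper: run a tool only if it is installed, else say so.
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $*"
        "$@"
    else
        echo "== skipped (not installed): $1"
    fi
}

# Is the drive still attached to the CAM layer at all?
run camcontrol devlist

# Model, firmware revision and supported features as reported by the drive.
run camcontrol identify "$DEV"

# SMART health, attribute table and the drive's own error log.
run smartctl -a "/dev/$DEV"

# Pool view, including the state of the cache vdev.
run zpool status -v
```

If the drive has already detached, the camcontrol and smartctl calls will
fail until it is visible again, so these are most useful right after a
reboot. In the smartctl output, the reallocated-sector count, power-on hours
and the firmware revision are the fields worth noting first. A short
self-test can be started afterwards with "smartctl -t short /dev/ada4".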

On Wed, Oct 14, 2015 at 1:40 PM, Juan Bernhard <juan at inti.gob.ar> wrote:

>
> On 14/10/2015 at 02:24 p.m., cruxpot wrote:
>
>> I recently added a Crucial 64GB SSD that I had lying around to my ZFS
>> pool. Unfortunately, it keeps dropping off and I'm not sure why. The drive
>> had not failed when I removed it from an old laptop. It has happened twice,
>> and only a system restart brings it back. The log messages repeat, but
>> here is the base message:
>>
>>
>>   zpool status
>>    pool: zrewt
>>   state: ONLINE
>> status: One or more devices has been removed by the administrator.
>>          Sufficient replicas exist for the pool to continue functioning in a
>>          degraded state.
>> action: Online the device using 'zpool online' or replace the device with
>>          'zpool replace'.
>>    scan: none requested
>> config:
>>
>>          NAME                    STATE     READ WRITE CKSUM
>>          zrewt                   ONLINE       0     0     0
>>            raidz1-0              ONLINE       0     0     0
>>              ada0                ONLINE       0     0     0
>>              ada1                ONLINE       0     0     0
>>              ada2                ONLINE       0     0     0
>>              ada3                ONLINE       0     0     0
>>          cache
>>            16818205039835910221  REMOVED      0     0     0  was /dev/ada4
>>
>> errors: No known data errors
>>
>> kernel:
>> Trying to mount root from zfs:zrewt []...
>> ahcich4: Timeout on slot 0 port 0
>> ahcich4: is 00000000 cs 00000000 ss 00000001 rs 00000001 tfd 40 serr
>> 00000000 cmd 0004c017
>> (ada4:ahcich4:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 38 78 2f 05 40 00 00 00
>> 00 00 00
>> (ada4:ahcich4:0:0:0): CAM status: Command timeout
>> (ada4:ahcich4:0:0:0): Retrying command
>> ahcich4: AHCI reset: device not ready after 31000ms (tfd = 00000080)
>> ahcich4: Timeout on slot 1 port 0
>> ahcich4: is 00000000 cs 00000002 ss 00000000 rs 00000002 tfd 80 serr
>> 00000000 cmd 0004c117
>> (aprobe0:ahcich4:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00
>> 00 00
>> (aprobe0:ahcich4:0:0:0): CAM status: Command timeout
>> (aprobe0:ahcich4:0:0:0): Retrying command
>> ahcich4: AHCI reset: device not ready after 31000ms (tfd = 00000080)
>> ahcich4: Timeout on slot 2 port 0
>> ahcich4: is 00000000 cs 00000004 ss 00000000 rs 00000004 tfd 80 serr
>> 00000000 cmd 0004c217
>> (aprobe0:ahcich4:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00
>> 00 00
>> (aprobe0:ahcich4:0:0:0): CAM status: Command timeout
>> (aprobe0:ahcich4:0:0:0): Error 5, Retries exhausted
>> ahcich4: AHCI reset: device not ready after 31000ms (tfd = 00000080)
>> ahcich4: Timeout on slot 3 port 0
>> ahcich4: is 00000000 cs 00000008 ss 00000000 rs 00000008 tfd 80 serr
>> 00000000 cmd 0004c317
>> (aprobe0:ahcich4:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00
>> 00 00
>> (aprobe0:ahcich4:0:0:0): CAM status: Command timeout
>> (aprobe0:ahcich4:0:0:0): Error 5, Retry was blocked
>> ada4 at ahcich4 bus 0 scbus6 target 0 lun 0
>> ada4: <M4-CT064M4SSD2 0009> s/n 0000000011290314E425 detached
>> ahcich4: AHCI reset: device not ready after 31000ms (tfd = 00000080)
>> ahcich4: Timeout on slot 4 port 0
>> ahcich4: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr
>> 00000000 cmd 0004c417
>> (aprobe0:ahcich4:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00
>> 00 00
>> (aprobe0:ahcich4:0:0:0): CAM status: Command timeout
>> (aprobe0:ahcich4:0:0:0): Retrying command
>> ahcich4: AHCI reset: device not ready after 31000ms (tfd = 00000080)
>> ahcich4: Timeout on slot 5 port 0
>> ahcich4: is 00000000 cs 00000020 ss 00000000 rs 00000020 tfd 80 serr
>> 00000000 cmd 0004c517
>> (aprobe0:ahcich4:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00
>> 00 00
>> (aprobe0:ahcich4:0:0:0): CAM status: Command timeout
>> (aprobe0:ahcich4:0:0:0): Error 5, Retries exhausted
>> ahcich4: AHCI reset: device not ready after 31000ms (tfd = 00000080)
>> ahcich4: Poll timeout on slot 7 port 0
>> ahcich4: is 00000000 cs 00000080 ss 00000000 rs 00000080 tfd 80 serr
>> 00000000 cmd 0004c717
>> (aprobe0:ahcich4:0:0:0): NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
>> (aprobe0:ahcich4:0:0:0): CAM status: Command timeout
>> (aprobe0:ahcich4:0:0:0): Error 5, Retries exhausted
>> ahcich4: Timeout on slot 8 port 0
>> ahcich4: is 00000000 cs 00000100 ss 00000000 rs 00000100 tfd 80 serr
>> 00000000 cmd 0004c817
>> (ada4:ahcich4:0:0:0): SETFEATURES ENABLE RCACHE. ACB: ef aa 00 00 00 40 00
>> 00 00 00 00 00
>> (ada4:ahcich4:0:0:0): CAM status: Command timeout
>> (ada4:ahcich4:0:0:0): Error 5, Periph was invalidated
>> ahcich4: AHCI reset: device not ready after 31000ms (tfd = 00000080)
>> ahcich4: Poll timeout on slot 10 port 0
>> ahcich4: is 00000000 cs 00000400 ss 00000000 rs 00000400 tfd 80 serr
>> 00000000 cmd 0004ca17
>> (aprobe0:ahcich4:0:0:0): NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
>> (aprobe0:ahcich4:0:0:0): CAM status: Command timeout
>> (aprobe0:ahcich4:0:0:0): Error 5, Retries exhausted
>> ahcich4: Timeout on slot 11 port 0
>> ahcich4: is 00000000 cs 00000800 ss 00000800 rs 00000800 tfd 80 serr
>> 00000000 cmd 0004cb17
>> (ada4:ahcich4:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 38 78 2f 05 40 00 00 00
>> 00 00 00
>> (ada4:ahcich4:0:0:0): CAM status: Command timeout
>> (ada4:ahcich4:0:0:0): Error 5, Periph was invalidated
>> (ada4:ahcich4:0:0:0): Periph destroyed
>> ahcich4: AHCI reset: device not ready after 31000ms (tfd = 00000080)
>> ahcich4: Poll timeout on slot 13 port 0
>> ahcich4: is 00000000 cs 00002000 ss 00000000 rs 00002000 tfd 80 serr
>> 00000000 cmd 0004cd17
>> (aprobe0:ahcich4:0:0:0): NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
>> (aprobe0:ahcich4:0:0:0): CAM status: Command timeout
>> (aprobe0:ahcich4:0:0:0): Error 5, Retries exhausted
>>
>
> The SSD is not becoming ready within 31 seconds of the AHCI reset. Try
> using it as a regular disk and run some benchmarks on it to test it under
> load. If the disk was working on another computer, check the cable and the
> SATA port.
>
> Regards, Juan