SSD errors

heasley heas at shrubbery.net
Thu Apr 13 21:16:24 UTC 2017


I have 4 SSDs in zfs raidz2 on 11.0-RELEASE-p2.  They're on QD sleds rated
for sata6 to convert them from 2.5" to 3.5" slots in a Supermicro SC733TQ
chassis (2012) with a Supermicro X10SRi-F mb, using the on-board
controller (either one).  I swapped the SATA cables for some with "extra
shielding".  The 4 HDs used previously (a mix of WD and Seagate, 750G
and 2T) worked flawlessly.

The bios has a "SATA Device Type" option for SSD/HD, which is set to SSD
and seems to only apply to spin-up signals.  The chassis manual makes no
reference to SSDs, and I have found no fbsd configuration or
recommendations specific to SSDs.

When I push a lot of data to them, such as an rsync, I receive errors like
those below.  If I move drives between slots, the errors seem to follow the
chassis slots, those closest to the power supply, but I'm not positive
about this.

I suppose the questions for the list are:
- have I missed any fbsd ssd-specific configuration?

- all 4 have non-zero UDMA_CRC_Error_Count counters; not many, about the
  same number, which I believe implies electrical interference - most
  likely in the cable or chassis backplane.  Should I buy some specific
  model cable?  other recommendations?
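
One way to firm up the "follows the slot" suspicion is to record the raw
UDMA_CRC_Error_Count for each drive before and after a heavy transfer, and
see which devices grew.  A minimal sketch, assuming the attribute table
format shown in the smartctl output in this message; the function names
crc_error_count and crc_delta are made up for illustration:

```python
def crc_error_count(smartctl_text):
    """Return the raw value of SMART attribute 199, or None if absent.

    Expects the attribute table as printed by `smartctl -A <device>`,
    where the raw value is the tenth whitespace-separated column.
    """
    for line in smartctl_text.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[9])
    return None


def crc_delta(before, after):
    """Return {device: increase} for devices whose count went up.

    `before` and `after` map device names (e.g. "ada2") to raw counts.
    A device missing from `before` is treated as starting at 0.
    """
    grown = {}
    for dev, count in after.items():
        diff = count - before.get(dev, 0)
        if diff > 0:
            grown[dev] = diff
    return grown


if __name__ == "__main__":
    # Sample line copied from the attribute table in this message.
    sample = ("199 UDMA_CRC_Error_Count    0x003e   099   099   000"
              "    Old_age   Always       -       33")
    print(crc_error_count(sample))
```

Taking one snapshot, swapping two sleds, running the rsync again and taking
a second snapshot should show whether the increase stays with the slot or
moves with the drive.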

tia

(ada2:ahcich6:0:0:0): READ_FPDMA_QUEUED. ACB: 60 80 d0 c2 cf 40 06 00 00 00 00 00
(ada2:ahcich6:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada2:ahcich6:0:0:0): Retrying command
(ada2:ahcich6:0:0:0): READ_FPDMA_QUEUED. ACB: 60 80 d8 c3 cf 40 06 00 00 00 00 00
(ada2:ahcich6:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada2:ahcich6:0:0:0): Retrying command

(ada3:ahcich7:0:0:0): READ_FPDMA_QUEUED. ACB: 60 80 18 1d fb 40 03 00 00 00 00 00
(ada3:ahcich7:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada3:ahcich7:0:0:0): Retrying command
(ada3:ahcich7:0:0:0): READ_FPDMA_QUEUED. ACB: 60 80 90 31 40 40 50 00 00 00 00 00
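
The ACB bytes in these messages can be decoded to see which LBAs and
transfer sizes were involved.  A rough sketch, assuming the 12 bytes are
printed in the order command, features, lba_low, lba_mid, lba_high, device,
lba_low_exp, lba_mid_exp, lba_high_exp, features_exp, sector_count,
sector_count_exp, and that for an NCQ read the features field carries the
transfer length in sectors.  That byte order is my assumption about how CAM
formats the line, not something confirmed here; decode_acb is a made-up
name:

```python
def decode_acb(acb):
    """Decode a 12-byte ACB hex string into command, LBA and length.

    Assumed byte order: command, features, lba_low, lba_mid, lba_high,
    device, lba_low_exp, lba_mid_exp, lba_high_exp, features_exp,
    sector_count, sector_count_exp.
    """
    b = [int(x, 16) for x in acb.split()]
    if len(b) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(b))
    (command, features, lba_low, lba_mid, lba_high, device,
     lba_low_exp, lba_mid_exp, lba_high_exp,
     features_exp, sector_count, sector_count_exp) = b

    # 48-bit LBA: the "exp" bytes hold bits 24..47.
    lba = (lba_high_exp << 40) | (lba_mid_exp << 32) | (lba_low_exp << 24) \
        | (lba_high << 16) | (lba_mid << 8) | lba_low

    # For NCQ commands the length is assumed to live in the features
    # field; a value of 0 conventionally means 65536 sectors.
    sectors = (features_exp << 8) | features
    if sectors == 0:
        sectors = 65536

    return {"command": command, "lba": lba, "sectors": sectors}


if __name__ == "__main__":
    # First ACB from the ada2 errors in this message.
    d = decode_acb("60 80 d0 c2 cf 40 06 00 00 00 00 00")
    print("cmd=0x%02x lba=0x%x sectors=%d"
          % (d["command"], d["lba"], d["sectors"]))
```

If the failing LBAs are scattered across the drive rather than clustered,
that would be consistent with a link problem rather than bad flash.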

Device Model:     Samsung SSD 850 EVO 2TB
LU WWN Device Id: 5 002538 c4042fdb8
Firmware Version: EMT02B6Q
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4c
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Thu Apr 13 20:43:52 2017 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x53) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 265) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       2552
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       14
177 Wear_Leveling_Count     0x0013   100   100   000    Pre-fail  Always       -       0
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   074   062   000    Old_age   Always       -       26
195 Hardware_ECC_Recovered  0x001a   200   200   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x003e   099   099   000    Old_age   Always       -       33
235 Unknown_Attribute       0x0012   099   099   000    Old_age   Always       -       2
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       5911167739

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
  255        0    65535  Read_scanning was never started
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


