SMART: disk problems on RAIDZ1 pool: (ada6:ahcich6:0:0:0): CAM status: ATA Status Error
O. Hartmann
ohartmann at walstatt.org
Tue Dec 12 18:22:36 UTC 2017
Hello,
Running CURRENT (recent, r326769), I noticed that smartd sends out some console
messages when the box boots:
[...]
Dec 12 14:14:33 <3.2> box1 smartd[68426]: Device: /dev/ada6, 1 Currently unreadable (pending) sectors
Dec 12 14:14:33 <3.2> box1 smartd[68426]: Device: /dev/ada6, 1 Offline uncorrectable sectors
[...]
Checking the drive's SMART log with smartctl (it is one of four 3 TB disk drives), I
gathered this information:
[... smartctl -x /dev/ada6 ...]
Error 42 [17] occurred at disk power-on lifetime: 25335 hours (1055 days + 15 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 00 c2 7a 72 98 40 00 Error: UNC at LBA = 0xc27a7298 = 3262804632
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 00 b0 00 88 00 00 c2 7a 73 20 40 08 23:38:12.195 READ FPDMA QUEUED
60 00 b0 00 80 00 00 c2 7a 72 70 40 08 23:38:12.195 READ FPDMA QUEUED
2f 00 00 00 01 00 00 00 00 00 10 40 08 23:38:12.195 READ LOG EXT
60 00 b0 00 70 00 00 c2 7a 73 20 40 08 23:38:09.343 READ FPDMA QUEUED
60 00 b0 00 68 00 00 c2 7a 72 70 40 08 23:38:09.343 READ FPDMA QUEUED
[...]
and
[...]
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    64
  3 Spin_Up_Time            POS--K   178   170   021    -    6075
  4 Start_Stop_Count        -O--CK   098   098   000    -    2406
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
  7 Seek_Error_Rate         -OSR-K   200   200   000    -    0
  9 Power_On_Hours          -O--CK   066   066   000    -    25339
 10 Spin_Retry_Count        -O--CK   100   100   000    -    0
 11 Calibration_Retry_Count -O--CK   100   100   000    -    0
 12 Power_Cycle_Count       -O--CK   098   098   000    -    2404
192 Power-Off_Retract_Count -O--CK   200   200   000    -    154
193 Load_Cycle_Count        -O--CK   001   001   000    -    2055746
194 Temperature_Celsius     -O---K   122   109   000    -    28
196 Reallocated_Event_Count -O--CK   200   200   000    -    0
197 Current_Pending_Sector  -O--CK   200   200   000    -    1
198 Offline_Uncorrectable   ----CK   200   200   000    -    1
199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    0
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    5
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning
[...]
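As a cross-check of the error log entry above, the following minimal Python sketch recomputes the LBA from the register bytes and the day count from the power-on hours. It assumes the usual smartctl column layout (ER, a separator, ST, two COUNT bytes, then six LBA bytes with the most significant byte first); the 512-byte logical sector size used for the byte offset is an assumption as well and is not stated in the output.

```python
# Register row copied from the "After command completion" table above.
ROW = "40 -- 51 00 00 00 00 c2 7a 72 98 40 00"

tokens = ROW.split()
# Assumed layout: ER, --, ST, COUNT (2 bytes), LBA (6 bytes), DV, DC.
lba_bytes = bytes(int(t, 16) for t in tokens[5:11])
lba = int.from_bytes(lba_bytes, "big")

SECTOR_SIZE = 512  # assumption: 512-byte logical sectors
byte_offset = lba * SECTOR_SIZE

# Power-on lifetime reported for the error, split into days and hours.
hours = 25335
days, rest = divmod(hours, 24)

print(f"LBA = {lba:#x} = {lba}")
print(f"byte offset on disk (assuming {SECTOR_SIZE}-byte sectors): {byte_offset}")
print(f"{hours} hours = {days} days + {rest} hours")
```

Both derived values agree with what smartctl prints: LBA 3262804632 and 1055 days + 15 hours.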
The ZFS pool is a RAIDZ1 made up of three WD Green 3 TB HDDs and one WD RED 3 TB HDD. The
failure occurred on one of the WD Green 3 TB HDDs.
The pool is marked as "resilvered". I scrub it on a regular basis, and the
"resilvering" message has now appeared for the second time in a row. What I found on the
net about SMART attribute 197 errors (in my case the raw value is one), in combination
with the problems that occurred, suggests that I should replace the disk.
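The check described here, looking for non-zero raw values in the sector-health attributes, can be sketched in a few lines of Python. This is only an illustration: the set of watched IDs (5, 196, 197, 198) is an assumption about which attributes matter, and the helper name `nonzero_raw` is hypothetical. The input lines are copied from the smartctl output above.

```python
# Attribute rows copied from the smartctl output quoted in this message.
ATTRIBUTE_LINES = """\
5 Reallocated_Sector_Ct PO--CK 200 200 140 - 0
196 Reallocated_Event_Count -O--CK 200 200 000 - 0
197 Current_Pending_Sector -O--CK 200 200 000 - 1
198 Offline_Uncorrectable ----CK 200 200 000 - 1
"""

# Assumption: these are the attributes worth watching for bad sectors.
WATCHED = {5, 196, 197, 198}


def nonzero_raw(text, watched=WATCHED):
    """Return {id: (name, raw_value)} for watched attributes with raw != 0."""
    hits = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # not an attribute row
        attr_id, name, raw = int(fields[0]), fields[1], int(fields[7])
        if attr_id in watched and raw != 0:
            hits[attr_id] = (name, raw)
    return hits


flagged = nonzero_raw(ATTRIBUTE_LINES)
for attr_id, (name, raw) in sorted(flagged.items()):
    print(f"{attr_id} {name}: raw value {raw}")
```

For this drive only attributes 197 and 198 are flagged; the reallocation counters (5 and 196) are still zero.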
Well, here comes the problem. The box is built from "electronic waste" made by ASRock:
it is a Socket 1150/IvyBridge board which got its last firmware/BIOS update in 2013, and
since then UEFI booting FreeBSD from an HDD hasn't been possible (just to indicate that
I'm aware of having issues with crap, but that is a separate issue right now). All of the
board's SATA connectors are populated.
So: for lack of adequate backup space I can only back up selected portions; most of the
space is occupied by scientific modelling data that I have worked on. So a backup
exists, in one way or another. My concern is how to replace the faulty HDD. Most
HowTos assume that a replacement disk is prepared and attached first and then swapped in
via ZFS's replace command. That isn't applicable here, since every SATA port is in use.
Question: is it possible to simply pull the faulty disk (implies I know exactly which one
to pull!) and then prepare and add the replacement HDD and let the system do its job
resilvering the pool?
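For reference, the sequence asked about here could look roughly like the sketch below. It only builds and prints the command lines and runs nothing. The pool name `tank` is hypothetical, and using `ada6` as the vdev name is an assumption (the pool may reference its members by label instead); `zpool offline` and the single-device form of `zpool replace` are existing subcommands, but this is not a tested procedure for this machine.

```python
def replacement_commands(pool, old_dev, new_dev=None):
    """Build (not run) zpool command lines for an in-place disk swap.

    If new_dev is None, the single-device form of `zpool replace` is used,
    which applies when the new disk takes over the old disk's device node.
    """
    replace = ["zpool", "replace", pool, old_dev]
    if new_dev is not None:
        replace.append(new_dev)
    return [
        ["zpool", "status", pool],            # confirm which vdev is affected
        ["zpool", "offline", pool, old_dev],  # take the faulty disk offline
        # ... power down, swap the physical disk, boot again ...
        replace,                              # start resilvering onto the new disk
        ["zpool", "status", pool],            # watch resilver progress
    ]


# "tank" is a hypothetical pool name; ada6 is the device from the SMART report.
commands = replacement_commands("tank", "ada6")
for argv in commands:
    print(" ".join(argv))
```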
Next question is: I'm about to replace the 3 TB HDD with a more recent and modern 4 TB
HDD (WD RED 4TB). I'm aware of the fact that I can only use 3 TB as the other disks are 3
TB, but I'd like to know whether FreeBSD's ZFS is capable of handling it?
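The capacity reasoning behind this question can be written out as a small Python sketch. It assumes the simplified rule that every member of a RAIDZ1 vdev contributes only as much as the smallest disk and that one disk's worth of space goes to parity; metadata, padding overhead and the TB/TiB difference are ignored, and the function name is hypothetical.

```python
def raidz1_usable_tb(disk_sizes_tb):
    """Rough usable capacity of a single RAIDZ1 vdev, in TB.

    Simplifying assumption: each disk counts as the smallest member and
    one disk's worth of space is consumed by parity.
    """
    if len(disk_sizes_tb) < 2:
        raise ValueError("RAIDZ1 needs at least two disks")
    return (len(disk_sizes_tb) - 1) * min(disk_sizes_tb)


before = raidz1_usable_tb([3, 3, 3, 3])  # current pool: four 3 TB disks
after = raidz1_usable_tb([3, 3, 3, 4])   # one disk replaced by a 4 TB model

print(f"before: ~{before} TB, after: ~{after} TB")
```

Under these assumptions the usable capacity stays the same after the swap, because the extra 1 TB on the new disk is not used while the other members are 3 TB.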
This is the first time I have issues with ZFS and a faulty drive, so if some of my
questions sound naive, please forgive me.
Thanks in advance,
Oliver
--
O. Hartmann
I object to the use or transfer of my data for advertising purposes or for
market or opinion research (§ 28 Abs. 4 BDSG).