error destroying zfs filesystem
Alexandr Krivulya
shuriku at shurik.kiev.ua
Sat Feb 16 09:55:13 UTC 2013
On 15.02.2013 14:57, Peter Maloney wrote:
> On 2013-02-15 13:44, Alexandr Kovalenko wrote:
>> On Fri, Feb 15, 2013 at 11:30 AM, Alexandr Krivulya
>> <shuriku at shurik.kiev.ua> wrote:
>>> Hello everyone!
>>>
>>> After upgrading my zfs-only system from 8.2 to 9.1 I have many errors
>>> related to zfs in my /var/log/messages:
>>>
>>> Feb 15 13:12:44 gw kernel: metaslab_free_dva(): bad DVA
>>> 0:264842321920Solaris: WARNING: metaslab_free_dva(): bad DVA 0:338480095232
>>> Feb 15 13:12:44 gw kernel: Solaris: WARNING: metaslab_free_dva(): bad
>>> DVA 0:277633901056Solaris: WARNING:
>>> Feb 15 13:12:45 gw kernel: metaslab_free_dva(): bad DVA
>>> 0:277263710208Solaris: WARNING: metaslab_free_dva(): bad DVA
>>> 0:277633606144Solaris: WARNING: metaslab_free_dva(): bad DVA
>>> 0:278349642240Solaris: WARNING: metaslab_free_dva(): bad DVA
>>> 0:278429099008Solaris: WARNING: metaslab_free_dva(): bad DVA
>>> 0:278349926400Solaris: WARNING: metaslab_free_dva(): bad DVA
>>> 0:278245378560Solaris: WARNING: metaslab_free_dva(): bad DVA
>>> 0:256838777344Solaris: WARNING: metaslab_free_dva(): bad DVA 0:327364684800
>>> Feb 15 13:12:45 gw kernel: Solaris: WARNING: metaslab_free_dva(): bad
>>> DVA 0:312373604864
>>>
>>> root@gw:/ # zpool status -v
>>> pool: zmirror
>>> state: ONLINE
>>> status: One or more devices has experienced an error resulting in data
>>> corruption. Applications may be affected.
>>> action: Restore the file in question if possible. Otherwise restore the
>>> entire pool from backup.
>>> see: http://illumos.org/msg/ZFS-8000-8A
>>> scan: scrub repaired 0 in 1h39m with 1 errors on Thu Feb 14 17:48:53 2013
>>> config:
>>>
>>>         NAME            STATE     READ WRITE CKSUM
>>>         zmirror         ONLINE       0     0     2
>>>           mirror-0      ONLINE       0     0     8
>>>             gpt/disk01  ONLINE       0     0     8
>>>             gpt/disk02  ONLINE       0     0     8
>>>
>>> errors: Permanent errors have been detected in the following files:
>>>
>>> zmirror/usr:<0x0>
>>> <0xc8>:<0x0>
>> [dd]
>>> How can I solve this issue?
>> Run smartctl -t long /dev/<your_physical_drive_here> and then check
>> whether the output of smartctl -a /dev/<your_physical_drive_here>
>> shows any pending sectors or errors (for both of the drives used).
All tests seem to be fine:
root@gw:/usr/home/support # smartctl -l selftest /dev/ada0
smartctl 6.0 2012-10-10 r3643 [FreeBSD 9.1-RELEASE amd64] (local build)
Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 1446 -
root@gw:/usr/home/support # smartctl -l selftest /dev/ada1
smartctl 6.0 2012-10-10 r3643 [FreeBSD 9.1-RELEASE amd64] (local build)
Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 1630 -
smartctl also didn't show any problems; see the attached output.
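For a quick check of the counters that usually point at failing media or
cabling, a small filter over the smartctl -A output can help. This is only
a sketch: the function name check_smart_attrs and the choice of attributes
(5, 197, 198 and 199) are my assumptions, not something smartctl provides.

```shell
#!/bin/sh
# Sketch: read `smartctl -A` output on stdin and print every watched
# attribute whose RAW_VALUE (10th column) is non-zero.
# Returns 1 if at least one such attribute was found, 0 otherwise.
# The function name and the watched IDs are illustrative assumptions.
check_smart_attrs() {
    awk '
        # Attribute rows start with the numeric ID in the first column.
        # Watched: 5 Reallocated_Sector_Ct, 197 Current_Pending_Sector,
        #          198 Offline_Uncorrectable, 199 UDMA_CRC_Error_Count
        $1 ~ /^(5|197|198|199)$/ && NF >= 10 {
            if ($10 + 0 != 0) {
                printf "%s %s raw=%s\n", $1, $2, $10
                bad = 1
            }
        }
        END { exit (bad ? 1 : 0) }
    '
}
```

Usage would be something like "smartctl -A /dev/ada0 | check_smart_attrs".
No output means all four raw values are zero, which is the case for both
drives in the attached output.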
> You could also try going in /usr and "rm" or "truncate" some files
> until the "Permanent errors have been detected" list is empty. And
> this assumes you already ran a full scrub, which you must do to remove
> the files.
Now I cannot mount this filesystem to remove files:
root@gw:/usr/home/support # zfs mount zmirror/usr
cannot mount 'zmirror/usr': mountpoint or dataset is busy
The only way I see is to back up the entire pool, destroy and recreate it,
and restore from the backup.
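Roughly, that sequence could look like the dry run below, which only prints
the commands instead of running them. It is a sketch under assumptions: the
function name print_rebuild_plan, the snapshot name and the stream file path
are hypothetical, and the pool is assumed to be recreated on the same two
gpt/ labels. Because zmirror is also the system pool, the real commands
would have to be run from separate boot media, and zfs send may itself fail
on the damaged zmirror/usr dataset, in which case a file-level copy would be
needed for that dataset.

```shell
#!/bin/sh
# Dry run: print, but do not execute, the commands for a pool rebuild.
#   $1 = pool name
#   $2 = snapshot name (hypothetical)
#   $3 = file that will hold the replication stream (hypothetical)
print_rebuild_plan() {
    pool=$1
    snap=$2
    stream=$3
    echo "zfs snapshot -r ${pool}@${snap}"                      # recursive snapshot
    echo "zfs send -R ${pool}@${snap} > ${stream}"              # full replication stream
    echo "zpool destroy ${pool}"                                # destructive step
    echo "zpool create ${pool} mirror gpt/disk01 gpt/disk02"    # same mirror layout
    echo "zfs receive -F ${pool} < ${stream}"                   # restore all datasets
}
```

For example, "print_rebuild_plan zmirror rebuild /backup/zmirror.zfs" prints
the five commands for review; the stream file has to live outside the pool.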
-------------- next part --------------
root@gw:/usr/home/support # smartctl -iAH /dev/ada0
smartctl 6.0 2012-10-10 r3643 [FreeBSD 9.1-RELEASE amd64] (local build)
Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Western Digital RE4 Serial ATA
Device Model: WDC WD5003ABYX-01WERA1
Serial Number: WD-WMAYP3251340
LU WWN Device Id: 5 0014ee 0032ad53b
Firmware Version: 01.01S02
User Capacity: 500 107 862 016 bytes [500 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 7200 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS (minor revision not indicated)
SATA Version is: SATA 3.0, 3.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Sat Feb 16 11:49:32 2013 EET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
3 Spin_Up_Time 0x0027 139 139 021 Pre-fail Always - 4033
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 18
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
9 Power_On_Hours 0x0032 098 098 000 Old_age Always - 1465
10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 16
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 15
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 2
194 Temperature_Celsius 0x0022 116 095 000 Old_age Always - 27
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0
root@gw:/usr/home/support # smartctl -iAH /dev/ada1
smartctl 6.0 2012-10-10 r3643 [FreeBSD 9.1-RELEASE amd64] (local build)
Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Western Digital RE4 Serial ATA
Device Model: WDC WD5003ABYX-01WERA1
Serial Number: WD-WMAYP3265645
LU WWN Device Id: 5 0014ee 0032c2b14
Firmware Version: 01.01S02
User Capacity: 500 107 862 016 bytes [500 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 7200 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS (minor revision not indicated)
SATA Version is: SATA 3.0, 3.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Sat Feb 16 11:48:40 2013 EET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
3 Spin_Up_Time 0x0027 142 142 021 Pre-fail Always - 3875
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 15
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
9 Power_On_Hours 0x0032 098 098 000 Old_age Always - 1649
10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 13
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 12
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 2
194 Temperature_Celsius 0x0022 116 095 000 Old_age Always - 27
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0