out of HDD space - zfs degraded
Dan Langille
dan at langille.org
Sun Oct 3 02:04:35 UTC 2010
On 10/2/2010 7:50 PM, Jeremy Chadwick wrote:
> On Sat, Oct 02, 2010 at 07:23:16PM -0400, Dan Langille wrote:
>> On 10/2/2010 6:36 PM, Jeremy Chadwick wrote:
>>> On Sat, Oct 02, 2010 at 06:09:25PM -0400, Dan Langille wrote:
>>>> On 10/2/2010 10:19 AM, Jeremy Chadwick wrote:
>>>>> On Sat, Oct 02, 2010 at 09:43:30AM -0400, Dan Langille wrote:
>>>>>> Overnight I was running a zfs send | zfs receive (both within the
>>>>>> same system / zpool). The system ran out of space, a drive went off
>>>>>> line, and the system is degraded.
>>>>>>
>>>>>> This is a raidz2 array running on FreeBSD 8.1-STABLE #0: Sat Sep 18
>>>>>> 23:43:48 EDT 2010.
>>>>>>
>>>>>> The following logs are also available at
>>>>>> http://www.langille.org/tmp/zfs-space.txt<- no line wrapping
>>>>>>
>>>>>> This is what was running:
>>>>>>
>>>>>> # time zfs send storage/bacula at transfer | mbuffer | zfs receive
>>>>>> storage/compressed/bacula-mbuffer
>>>>>> in @ 0.0 kB/s, out @ 0.0 kB/s, 3670 GB total, buffer 100%
>>>>>> fullcannot receive new filesystem stream: out of space
>>>>>> mbuffer: error: outputThread: error writing to<stdout> at offset
>>>>>> 0x395917c4000: Broken pipe
>>>>>>
>>>>>> summary: 3670 GByte in 10 h 40 min 97.8 MB/s
>>>>>> mbuffer: warning: error during output to<stdout>: Broken pipe
>>>>>> warning: cannot send 'storage/bacula at transfer': Broken pipe
>>>>>>
>>>>>> real 640m48.423s
>>>>>> user 8m52.660s
>>>>>> sys 211m40.862s
>>>>>>
>>>>>>
>>>>>> Looking in the logs, I see this:
>>>>>>
>>>>>> Oct 2 00:50:53 kraken kernel: (ada0:siisch0:0:0:0): lost device
>>>>>> Oct 2 00:50:54 kraken kernel: siisch0: Timeout on slot 30
>>>>>> Oct 2 00:50:54 kraken kernel: siisch0: siis_timeout is 00040000 ss
>>>>>> 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000
>>>>>> Oct 2 00:50:54 kraken kernel: siisch0: Error while READ LOG EXT
>>>>>> Oct 2 00:50:55 kraken kernel: siisch0: Timeout on slot 30
>>>>>> Oct 2 00:50:55 kraken kernel: siisch0: siis_timeout is 00040000 ss
>>>>>> 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000
>>>>>> Oct 2 00:50:55 kraken kernel: siisch0: Error while READ LOG EXT
>>>>>> Oct 2 00:50:56 kraken kernel: siisch0: Timeout on slot 30
>>>>>> Oct 2 00:50:56 kraken kernel: siisch0: siis_timeout is 00040000 ss
>>>>>> 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000
>>>>>> Oct 2 00:50:56 kraken kernel: siisch0: Error while READ LOG EXT
>>>>>> Oct 2 00:50:57 kraken kernel: siisch0: Timeout on slot 30
>>>>>> Oct 2 00:50:57 kraken kernel: siisch0: siis_timeout is 00040000 ss
>>>>>> 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000
>>>>>> Oct 2 00:50:57 kraken kernel: siisch0: Error while READ LOG EXT
>>>>>> Oct 2 00:50:58 kraken kernel: siisch0: Timeout on slot 30
>>>>>> Oct 2 00:50:58 kraken kernel: siisch0: siis_timeout is 00040000 ss
>>>>>> 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000
>>>>>> Oct 2 00:50:58 kraken kernel: siisch0: Error while READ LOG EXT
>>>>>> Oct 2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage
>>>>>> path=/dev/gpt/disk06-live offset=270336 size=8192 error=6
>>>>>>
>>>>>> Oct 2 00:50:59 kraken kernel: (ada0:siisch0:0:0:0): Synchronize
>>>>>> cache failed
>>>>>> Oct 2 00:50:59 kraken kernel: (ada0:siisch0:0:0:0): removing device entry
>>>>>>
>>>>>> Oct 2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage
>>>>>> path=/dev/gpt/disk06-live offset=2000187564032 size=8192 error=6
>>>>>> Oct 2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage
>>>>>> path=/dev/gpt/disk06-live offset=2000187826176 size=8192 error=6
>>>>>>
>>>>>> $ zpool status
>>>>>> pool: storage
>>>>>> state: DEGRADED
>>>>>> scrub: scrub in progress for 5h32m, 17.16% done, 26h44m to go
>>>>>> config:
>>>>>>
>>>>>> NAME STATE READ WRITE CKSUM
>>>>>> storage DEGRADED 0 0 0
>>>>>> raidz2 DEGRADED 0 0 0
>>>>>> gpt/disk01-live ONLINE 0 0 0
>>>>>> gpt/disk02-live ONLINE 0 0 0
>>>>>> gpt/disk03-live ONLINE 0 0 0
>>>>>> gpt/disk04-live ONLINE 0 0 0
>>>>>> gpt/disk05-live ONLINE 0 0 0
>>>>>> gpt/disk06-live REMOVED 0 0 0
>>>>>> gpt/disk07-live ONLINE 0 0 0
>>>>>>
>>>>>> $ zfs list
>>>>>> NAME USED AVAIL REFER MOUNTPOINT
>>>>>> storage 6.97T 1.91T 1.75G /storage
>>>>>> storage/bacula 4.72T 1.91T 4.29T /storage/bacula
>>>>>> storage/compressed 2.25T 1.91T 46.9K /storage/compressed
>>>>>> storage/compressed/bacula 2.25T 1.91T 42.7K /storage/compressed/bacula
>>>>>> storage/pgsql 5.50G 1.91T 5.50G /storage/pgsql
>>>>>>
>>>>>> $ sudo camcontrol devlist
>>>>>> Password:
>>>>>> <Hitachi HDS722020ALA330 JKAOA28A> at scbus2 target 0 lun 0 (pass1,ada1)
>>>>>> <Hitachi HDS722020ALA330 JKAOA28A> at scbus3 target 0 lun 0 (pass2,ada2)
>>>>>> <Hitachi HDS722020ALA330 JKAOA28A> at scbus4 target 0 lun 0 (pass3,ada3)
>>>>>> <Hitachi HDS722020ALA330 JKAOA28A> at scbus5 target 0 lun 0 (pass4,ada4)
>>>>>> <Hitachi HDS722020ALA330 JKAOA28A> at scbus6 target 0 lun 0 (pass5,ada5)
>>>>>> <Hitachi HDS722020ALA330 JKAOA28A> at scbus7 target 0 lun 0 (pass6,ada6)
>>>>>> <ST380815AS 4.AAB> at scbus8 target 0 lun 0 (pass7,ada7)
>>>>>> <TSSTcorp CDDVDW SH-S223C SB01> at scbus9 target 0 lun 0 (cd0,pass8)
>>>>>> <WDC WD1600AAJS-75M0A0 02.03E02> at scbus10 target 0 lun 0 (pass9,ada8)
>>>>>>
>>>>>> I'm not yet sure if the drive is fully dead or not. This is not a
>>>>>> hot-swap box.
>>>>>
>>>>> It looks to me like the disk labelled gpt/disk06-live literally stopped
>>>>> responding to commands. The errors you see are coming from the OS and
>>>>> the siis(4) controller, and both indicate the actual hard disk isn't
>>>>> responding to the ATA command READ LOG EXT. error=6 means Device not
>>>>> configured.
>>>>>
>>>>> I can't see how/why running out of space would cause this. It looks
>>>>> more like that you had a hardware issue of some sort happen during the
>>>>> course of the operations you were running. It may not have happened
>>>>> until now because you hadn't utilised writes to that area of the disk
>>>>> (could have bad sectors there, or physical media/platter problems).
>>>>>
>>>>> Please provide smartctl -a output for the drive that's gpt/disk06-live,
>>>>> which I assume is /dev/ada6 (glabel sure makes correlation easy, doesn't
>>>>> it? Sigh...). Please put the results up on the web somewhere, not
>>>>> copy-pasted, otherwise I have to do a bunch of manual work with regarsd
>>>>> to line wrapping/etc... I'll provide an analysis of SMART stats for
>>>>> you, to see if anything crazy happened to the disk itself.
>>>>
>>>> It is ada0, I'm sure, based on the 'lost device' mentioned in
>>>> /var/log/messages above.
>>>>
>>>> I'm getting nowhere. /dev/ada0 does not exist so there is nothing
>>>> for smartctl to work on.
>>>>
>>>> [...]
>>>>
>>>> $ ls -l /dev/ada0*
>>>> ls: /dev/ada0*: No such file or directory
>>>
>>> Okay, so gpt/disk06-live is /dev/ada0. (I won't ask why the label is
>>> called "disk06", but whatever. :-) )
>>>
>>>> I am tempted to reboot or do a camontrol scan.
>>>
>>> DO NOT REBOOT.You can try the following -- I'm not sure whether to
>>> use scbus0 or scbus1 as the argument however, since I don't know what
>>> scbusX number ada0 was attached to previously. "dmesg" from when the
>>> machine booted would show this.
>>>
>>> camcontrol reset scbusX
>>> camcontrol rescan scbusX
>>
>> I see this in /var/run/dmesg.boot:
>>
>> ada0 at siisch0 bus 0 scbus0 target 0 lun 0
>> ada0:<Hitachi HDS722020ALA330 JKAOA28A> ATA-8 SATA 2.x device
>> ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
>> ada0: Command Queueing enabled
>>
>> $ sudo camcontrol reset scbus0
>> Password:
>> Reset of bus 0 was successful
>>
>> $ sudo camcontrol rescan scbus0
>> Re-scan of bus 0 was successful
>>
>>> If the disk comes back, please smartctl -a it.
>>
>> I didn't come back:
>>
>> $ ls /dev/ada*
>> /dev/ada1 /dev/ada2p1 /dev/ada4 /dev/ada5p1 /dev/ada7
>> /dev/ada1p1 /dev/ada3 /dev/ada4p1 /dev/ada6 /dev/ada8
>> /dev/ada2 /dev/ada3p1 /dev/ada5 /dev/ada6p1
>>
>> FYI, there's nothing new in /var/log/messages as a results of those
>> commands.
>
> Then I would recommend power-cycling (not rebooting or pressing of the
> reset button) the machine. There's a good chance the ada0 disk has
> fallen off the bus and needs a full power-cycle, since a LUN scan didn't
> result in its reappearance.
>
> I see this kind of problem on a weekly basis at my workplace, in 3
> different datacenters, with Fujitsu SCSI-3 disks. A system reboot
> doesn't make the disk reappear on on the bus, nor does a reset (pressing
> of the reset button). Only a full power-cycle works. And when I say
> weekly, I'm not exaggerating.
>
> I realise your disks are Hitachi not Fujitsu, and are SATA not SCSI, but
> it really doesn't matter -- there are cases where the drive firmware is
> wedged so hard that a physical power-cycle is required.
>
> If a power-cycle works, smartctl -a /dev/ada0 the disk and save the
> SMART stats somewhere. If the same disk fails in this way again, I
> strongly recommend advance RMA'ing it (to ensure you get a completely
> different disk).
>
> Good luck!
thanks.
After a 'shutdown -p now', it was about 20 minutes before I went and
powered it up (I was on minecraft). The box came back with the missing HDD:
$ zpool status storage
pool: storage
state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are
unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-9P
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
storage ONLINE 0 0 0
raidz2 ONLINE 0 0 0
gpt/disk01-live ONLINE 0 0 0
gpt/disk02-live ONLINE 0 0 0
gpt/disk03-live ONLINE 0 0 0
gpt/disk04-live ONLINE 0 0 0
gpt/disk05-live ONLINE 0 0 0
gpt/disk06-live ONLINE 0 0 12
gpt/disk07-live ONLINE 0 0 0
smartctl output attached and and http://www.langille.org/tmp/ada0.txt
--
Dan Langille - http://langille.org/
-------------- next part --------------
$ sudo smartctl -a /dev/ada0
smartctl 5.39.1 2010-01-28 r3054 [FreeBSD 8.1-STABLE amd64] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Model Family: Hitachi Deskstar 7K2000
Device Model: Hitachi HDS722020ALA330
Serial Number: JK1131YAHLJWLV
Firmware Version: JKAOA28A
User Capacity: 2,000,398,934,016 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Sat Oct 2 21:59:03 2010 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x84) Offline data collection activity
was suspended by an interrupting command from host.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (22771) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 255) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 016 Pre-fail Always - 0
2 Throughput_Performance 0x0005 132 132 054 Pre-fail Offline - 103
3 Spin_Up_Time 0x0007 180 180 024 Pre-fail Always - 441 (Average 361)
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 17
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 114 114 020 Pre-fail Offline - 38
9 Power_On_Hours 0x0012 100 100 000 Old_age Always - 1671
10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 17
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 28
193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 28
194 Temperature_Celsius 0x0002 181 181 000 Old_age Always - 33 (Lifetime Min/Max 25/42)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0
SMART Error Log Version: 0
No Errors Logged
SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
More information about the freebsd-stable
mailing list