zpool import hangs when out of space - Was: zfs pool import hangs on [tx->tx_sync_done_cv]
Steven Hartland
killing at multiplay.co.uk
Tue Oct 14 11:20:06 UTC 2014
----- Original Message -----
From: "Steven Hartland" <killing at multiplay.co.uk>
To: "K. Macy" <kmacy at freebsd.org>
Cc: "freebsd-fs at FreeBSD.org" <freebsd-fs at freebsd.org>; "mark" <Mark.Martinec at ijs.si>; "FreeBSD Stable" <freebsd-stable at freebsd.org>
Sent: Tuesday, October 14, 2014 9:14 AM
Subject: Re: zpool import hangs when out of space - Was: zfs pool import hangs on [tx->tx_sync_done_cv]
> ----- Original Message -----
> From: "K. Macy" <kmacy at freebsd.org>
>
>
>>>> Thank you both for analysis and effort!
>>>>
>>>> I can't rule out the possibility that my main system pool
>>>> on an SSD was low on space at some point in time, but the
>>>> three 4 GiB cloned pools (sys1boot and its brothers) were all
>>>> created as zfs send / receive copies of the main / (root)
>>>> file system, and I haven't noticed anything unusual during
>>>> syncing. This syncing was done manually (using zxfer) and
>>>> independently from the upgrade of the system - on a steady/quiet
>>>> system, when the source file system definitely had sufficient
>>>> free space.
>>>>
>>>> The source file system now shows 1.2 GiB of usage in df:
>>>> shiny/ROOT 61758388 1271620 60486768 2% /
>>>> Seems unlikely that the 1.2 GiB has grown to 4 GiB space
>>>> on a cloned filesystem.
>>>>
>>>> Will try to import the main two pools after re-creating
>>>> a sane boot pool...
>>>
>>>
>>> Yeah, zfs list only shows around 2-3GB used too, but zpool list
>>> shows the pool is out of space. Can't rule out an accounting
>>> issue though.
>>>
>>
>> What is using the extra space in the pool? Is there an unmounted
>> dataset or snapshot? Do you know how to easily tell? Unlike txg and
>> zio processing I don't have the luxury of having just read that part
>> of the codebase.
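A quick way to answer that question is ZFS's own space accounting; a sketch, assuming the pool name sys1boot from later in this thread:

```shell
# Per-dataset space breakdown: USEDSNAP, USEDDS, USEDREFRESERV and
# USEDCHILD columns show where the space actually went:
zfs list -o space -r sys1boot
# List every dataset AND snapshot, mounted or not:
zfs list -t all -r sys1boot
# Compare against pool-level allocation, which also counts metadata
# and allocator overhead that zfs list does not report:
zpool list -v sys1boot
```

If the USED totals from zfs list fall well short of ALLOC in zpool list, the gap is pool-level overhead rather than any hidden dataset.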
>
> It's not clear, but I believe it could just be fragmentation, even
> though it's ashift=9.
>
> I sent the last snapshot to another pool of the same size and it
> resulted in:
> NAME       SIZE  ALLOC   FREE  FRAG  EXPANDSZ  CAP  DEDUP  HEALTH  ALTROOT
> sys1boot  3.97G  3.97G   190K    0%         -  99%  1.00x  ONLINE  -
> sys1copy  3.97G  3.47G   512M   72%         -  87%  1.00x  ONLINE  -
>
> I believe FRAG is 0% because the feature wasn't enabled for the lifetime
> of the pool, so it's simply not showing a valid value.
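That theory is checkable: FRAG is computed from the spacemap_histogram feature, and on pools that predate it the column stays at 0% (or -) until metaslab spacemaps are rewritten. A sketch, pool name assumed:

```shell
# Show whether the feature backing the FRAG column is active:
zpool get feature@spacemap_histogram sys1boot
# "disabled" or "enabled" (rather than "active") would explain a
# misleading 0% FRAG on an otherwise badly fragmented pool.
```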
>
> zfs list -t all -r sys1boot
> NAME                                   USED  AVAIL  REFER  MOUNTPOINT
> sys1boot                              1.76G  2.08G    11K  /sys1boot
> sys1boot/ROOT                         1.72G  2.08G  1.20G  /sys1boot/ROOT
> sys1boot/ROOT@auto-2014-08-16_04.00      1K      -  1.19G  -
> sys1boot/ROOT@auto-2014-08-17_04.00      1K      -  1.19G  -
..
Well, interesting issue: I left this pool alone this morning, literally
doing nothing, and it's now out of space.
zpool list
NAME       SIZE  ALLOC   FREE  FRAG  EXPANDSZ  CAP  DEDUP  HEALTH  ALTROOT
sys1boot  3.97G  3.97G   190K    0%         -  99%  1.00x  ONLINE  -
sys1copy  3.97G  3.97G     8K    0%         -  99%  1.00x  ONLINE  -
There's something very wrong here as nothing has been accessing the pool.
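One way to see what wrote to a pool that was supposedly idle is the pool's transaction history; a sketch, pool name assumed:

```shell
# -i includes internally-logged events (txg syncs, auto-snapshot
# destroys, etc.), -l adds user/host/time detail; together they show
# any activity the pool saw while it appeared untouched:
zpool history -il sys1copy
```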
  pool: zfs
 state: ONLINE
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://illumos.org/msg/ZFS-8000-HC
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zfs         ONLINE       0     2     0
          md1       ONLINE       0     0     0
I tried destroying the pool and even that failed, presumably because
the pool has suspended I/O.
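A pool with suspended I/O generally has to be resumed before destroy can proceed, since destroy itself needs to write. A sketch, using the pool name "zfs" from the status output above:

```shell
# Per the status output's own advice, try to resume I/O first:
zpool clear zfs
# Then retry the destroy; -f forces unmounting of any datasets:
zpool destroy -f zfs
# The pool's failmode property governs whether I/O suspends ("wait",
# the default) or errors out on device failure:
zpool get failmode zfs
```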
Regards
Steve