ZFS v28: mount dataset hangs
Sergey Lobanov
wmn at siberianet.ru
Tue Dec 6 12:20:00 UTC 2011
On Thursday 06 October 2011 at 16:04:30 Sergey Gavrilov wrote:
> Hello, all.
>
> After "chown -R" hanging and reboot I cannot mount one of the
> datasets, while other pools work well.
> CTRL-T shows state [tx->tx_sync_done_cv)]
>
> Hardware: 3ware 9690SA-8I, 8 SEAGATE ST32000445SS in raidz2, 48 Gb
> RAM, 2 E5520 CPU
> Software: FreeBSD 8.2-STABLE #3: Wed Oct 5 18:04:50 MSD 2011
> /usr/obj/usr/src/sys/GENERIC amd64, ZFS V28 compression=on
>
> I can clone and mount any snapshot of the dataset, but not itself.
> No errors in zpool status or system messages. Scrub has finished clear.
>
> Let me know if I could provide any additional information to solve the
> issue.
Hello.
Same problem here.
raidz2 on 4 disks (created with a fake 4k sector size via gnop, to allow a
possible future replacement by native 4k disks) with dedup, lzjb compression,
and NFSv4 ACLs, shared over Samba 3.6 to Windows clients. The system froze
(even local log-in via IPMI did not work) while a Windows station was emptying
the Recycle Bin over SMB on one of the datasets (a large number of files were
being removed). After a hard reset via IPMI, zfs hangs on mounting that dataset
under both 8.2 and 9.0-RC2 ("zfs set mountpoint" works fine, though). Scrubs
under 8.2 and 9.0-RC2 did not help and found nothing; the other datasets mount
without problems.
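For reference, the gnop trick mentioned above is usually done along these
lines (a sketch only; the device names ada0..ada3 and the pool name are
assumptions, not my actual layout):

```shell
# Create transparent gnop providers that report a 4k sector size
# (device and pool names here are hypothetical)
gnop create -S 4096 /dev/ada0
gnop create -S 4096 /dev/ada1
gnop create -S 4096 /dev/ada2
gnop create -S 4096 /dev/ada3

# Build the raidz2 pool on the .nop providers so ZFS chooses ashift=12
zpool create data raidz2 /dev/ada0.nop /dev/ada1.nop /dev/ada2.nop /dev/ada3.nop

# The .nop layer can be dropped after export; the pool keeps ashift=12
zpool export data
gnop destroy /dev/ada0.nop /dev/ada1.nop /dev/ada2.nop /dev/ada3.nop
zpool import data
```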
Storage consists of 2 Seagate disks (ST3160318AS CC38) with UFS for the
system, 4 WD disks (WD10EALX-009BA0 15.01H15) for the ZFS share, and a QLogic
2432 FC adapter connected through an FC switch to an HP tape library for
Bacula backups.
All WD disks were checked with a SMART long self-test, which completed without
errors; there are also no reallocated sectors (or events) in the SMART output.
Here is an example from one of the disks:
ID#  ATTRIBUTE_NAME           FLAG    VALUE  WORST  THRESH  TYPE      UPDATED  WHEN_FAILED  RAW_VALUE
  1  Raw_Read_Error_Rate      0x002f  200    200    051     Pre-fail  Always   -            0
  3  Spin_Up_Time             0x0027  177    175    021     Pre-fail  Always   -            4108
  4  Start_Stop_Count         0x0032  100    100    000     Old_age   Always   -            60
  5  Reallocated_Sector_Ct    0x0033  200    200    140     Pre-fail  Always   -            0
  7  Seek_Error_Rate          0x002e  200    200    000     Old_age   Always   -            0
  9  Power_On_Hours           0x0032  092    092    000     Old_age   Always   -            6183
 10  Spin_Retry_Count         0x0032  100    253    000     Old_age   Always   -            0
 11  Calibration_Retry_Count  0x0032  100    253    000     Old_age   Always   -            0
 12  Power_Cycle_Count        0x0032  100    100    000     Old_age   Always   -            58
192  Power-Off_Retract_Count  0x0032  200    200    000     Old_age   Always   -            47
193  Load_Cycle_Count         0x0032  200    200    000     Old_age   Always   -            12
194  Temperature_Celsius      0x0022  122    110    000     Old_age   Always   -            25
196  Reallocated_Event_Count  0x0032  200    200    000     Old_age   Always   -            0
197  Current_Pending_Sector   0x0032  200    200    000     Old_age   Always   -            0
198  Offline_Uncorrectable    0x0030  200    200    000     Old_age   Offline  -            0
199  UDMA_CRC_Error_Count     0x0032  200    200    000     Old_age   Always   -            0
200  Multi_Zone_Error_Rate    0x0008  200    200    000     Old_age   Offline  -            85
Here is some other output:
---
# zfs mount data/storage/public
load: 0.15 cmd: zfs 1352 [tx->tx_sync_done_cv)] 348.58r 0.00u 0.92s 0% 2296k
# procstat -kk 1352
PID TID COMM TDNAME KSTACK
1352 100081 zfs initial thread mi_switch+0x176
sleepq_wait+0x42 _cv_wait+0x129 txg_wait_synced+0x85 zil_replay+0x10a
zfsvfs_setup+0x117 zfs_mount+0x52f vfs_donmount+0xdc5 nmount+0x63
amd64_syscall+0x1f4 Xfast_syscall+0xfc
---
# zfs snapshot data/storage/public@test
load: 0.02 cmd: zfs 1382 [tx->tx_sync_done_cv)] 4.34r 0.00u 0.00s 0% 2280k
load: 0.09 cmd: zfs 1382 [tx->tx_sync_done_cv)] 19.13r 0.00u 0.00s 0% 2280k
load: 0.01 cmd: zfs 1352 [tx->tx_sync_done_cv)] 611.74r 0.00u 0.92s 0% 2296k
# procstat -kk 1382
PID TID COMM TDNAME KSTACK
1382 100295 zfs initial thread mi_switch+0x176
sleepq_wait+0x42 _cv_wait+0x129 txg_wait_synced+0x85
dsl_sync_task_group_wait+0x128 dmu_objset_snapshot+0x302
zfs_ioc_snapshot+0x1a8 zfsdev_ioctl+0xe6 devfs_ioctl_f+0x7b kern_ioctl+0x102
ioctl+0xfd amd64_syscall+0x1f4 Xfast_syscall+0xfc
---
# procstat -kk 6
PID TID COMM TDNAME KSTACK
6 100058 zfskern arc_reclaim_thre mi_switch+0x176
sleepq_timedwait+0x42 _cv_timedwait+0x134 arc_reclaim_thread+0x29d
fork_exit+0x11f fork_trampoline+0xe
6 100059 zfskern l2arc_feed_threa mi_switch+0x176
sleepq_timedwait+0x42 _cv_timedwait+0x134 l2arc_feed_thread+0x1a8
fork_exit+0x11f fork_trampoline+0xe
6 100279 zfskern txg_thread_enter mi_switch+0x176
sleepq_wait+0x42 _cv_wait+0x129 txg_thread_wait+0x79 txg_quiesce_thread+0xb5
fork_exit+0x11f fork_trampoline+0xe
6 100280 zfskern txg_thread_enter mi_switch+0x176
sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 dbuf_read+0x5e5
dmu_buf_hold+0xe0 zap_idx_to_blk+0xa3 zap_deref_leaf+0x4f fzap_remove+0x33
zap_remove_uint64+0x85 ddt_sync+0x267 spa_sync+0x383 txg_sync_thread+0x139
fork_exit+0x11f fork_trampoline+0xe
---
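The txg_sync backtrace above ends in ddt_sync, i.e. the sync thread is stuck
working through the dedup table. One way to see how large the DDT has grown
(a sketch; "data" is the pool name from the outputs above):

```shell
# Print dedup table (DDT) statistics and histogram for the pool
zdb -DD data

# Simulated dedup statistics, useful to estimate what dedup costs
# even if it has since been turned off on the datasets
zdb -S data
```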
And here is an interesting screenshot of output from a 9.0-RC2 livecd shell,
which leads me to believe that swap was exhausted while the zfs mount was
frozen: http://imageshack.us/photo/my-images/64/90rc2zfsmounthangswap.gif/
(I didn't see this on 8.2 or 9.0-RC1, though I didn't deliberately leave the
system frozen for a long time.)
Hardware is a Supermicro X8SIA-F, Core i3, 4 GB of memory.
releng8 r226341 amd64 (with e1000 from 8.2-RELEASE for an unrelated reason),
custom kernel with this config:
include GENERIC
ident IPFW_FWD
options IPFIREWALL
options IPFIREWALL_DEFAULT_TO_ACCEPT
options IPFIREWALL_FORWARD
The only ZFS-related tuning in loader.conf is the following line:
vfs.zfs.arc_max="1073741824"
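The effective value can be verified at runtime (sysctl names as they exist in
the FreeBSD 8/9 ZFS port):

```shell
# Confirm the loader.conf setting took effect (1 GiB = 1073741824 bytes)
sysctl vfs.zfs.arc_max

# Current ARC size in bytes
sysctl kstat.zfs.misc.arcstats.size
```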
I have a couple of days for debugging (I plan to try releng9 today or
tomorrow, since I can see some ZFS commits there after RC2), after which I'll
have to move this machine into production, with ZFS or without it.
If you need anything else, let me know. I've subscribed to the list, so there
is no need to CC me.
--
ISP SiberiaNet
System and Network Administrator
More information about the freebsd-fs mailing list