ZFS inresponsive.
Fabian Keil
freebsd-listen at fabiankeil.de
Sun Oct 31 09:56:33 UTC 2010
Peter Ankerstål <peter at pean.org> wrote:
> I accidentally aborted a zfs send mid-tranfer.
>
> Now the receiveing side cant do anything with the affected filesystem. both zfs list and zfs destroy hangs.
> How can I fix this? I can list other filesystems without a problem but a general list or any command concerning
> the one the got broken in transer hangs the zfs command.
I occasionally see this, too, when sending snapshots to zpools on
USB disks using geli. If the device gets lost in transfer, general
zfs and zpool commands hang:
Oct 18 17:09:28 r500 sudo: fk : TTY=pts/9 ; PWD=/home/fk ; USER=root ; COMMAND=/sbin/geli attach -k /home/fk/geli-keys/poppstar.key /dev/label/poppstar
Oct 18 17:09:34 r500 sudo: fk : TTY=pts/9 ; PWD=/home/fk ; USER=root ; COMMAND=/sbin/zpool import poppstar
Oct 18 17:09:34 r500 kernel: GEOM_ELI: Device label/poppstar.eli created.
Oct 18 17:09:34 r500 kernel: GEOM_ELI: Encryption: AES-CBC 128
Oct 18 17:09:34 r500 kernel: GEOM_ELI: Crypto: software
Oct 18 17:09:35 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status error
Oct 18 17:09:35 r500 kernel: (cd0:ahcich1:0:0:0): Requesting SCSI sense data
Oct 18 17:09:35 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status error
Oct 18 17:09:35 r500 kernel: (cd0:ahcich1:0:0:0): READ CAPACITY. CDB: 25 0 0 0 0 0 0 0 0 0
Oct 18 17:09:35 r500 kernel: (cd0:ahcich1:0:0:0): CAM status: SCSI Status Error
Oct 18 17:09:35 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status: Check Condition
Oct 18 17:09:35 r500 kernel: (cd0:ahcich1:0:0:0): SCSI sense: NOT READY asc:3a,1 (Medium not present - tray closed)
Oct 18 17:09:35 r500 kernel: (cd0:ahcich1:0:0:0): Error 6, Unretryable error
Oct 18 17:09:58 r500 sudo: fk : TTY=pts/9 ; PWD=/home/fk ; USER=root ; COMMAND=/sbin/zfs create poppstar/dvds/sliders
Oct 18 17:10:32 r500 sudo: fk : TTY=pts/9 ; PWD=/home/fk ; USER=root ; COMMAND=/sbin/zfs send wde2/dvds/sliders/season-1-2 at 2009-12-05
Oct 18 17:10:32 r500 sudo: fk : TTY=pts/9 ; PWD=/home/fk ; USER=root ; COMMAND=/sbin/zfs receive -v poppstar/dvds/sliders/season-1-2
Oct 18 17:38:06 r500 sudo: fk : TTY=pts/9 ; PWD=/home/fk ; USER=root ; COMMAND=/sbin/zfs receive -v poppstar/dvds/sliders/season-3
Oct 18 17:38:06 r500 sudo: fk : TTY=pts/9 ; PWD=/home/fk ; USER=root ; COMMAND=/sbin/zfs send wde2/dvds/sliders/season-3 at 2009-03-28
Oct 18 17:39:47 r500 kernel: acpi_lid0: Lid closed
Oct 18 18:13:19 r500 kernel: acpi_lid0: Lid opened
Oct 18 18:22:56 r500 sudo: fk : TTY=pts/9 ; PWD=/home/fk ; USER=root ; COMMAND=/sbin/zfs send wde2/dvds/sliders/season-4 at 2009-12-05
Oct 18 18:22:56 r500 sudo: fk : TTY=pts/9 ; PWD=/home/fk ; USER=root ; COMMAND=/sbin/zfs receive -v poppstar/dvds/sliders/season-4
Oct 18 18:33:05 r500 kernel: ugen3.3: <Generic> at usbus3 (disconnected)
Oct 18 18:33:05 r500 kernel: umass2: at uhub3, port 1, addr 3 (disconnected)
Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): SCSI status error
Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Requesting SCSI sense data
Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): AutoSense failed
Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Error 5, Unretryable error
Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error=5). label/poppstar.eli[WRITE(offset=362645020672, length=131072)]
Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error=5). label/poppstar.eli[WRITE(offset=362645151744, length=131072)]
Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error=5). label/poppstar.eli[WRITE(offset=362645282816, length=131072)]
Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error=5). label/poppstar.eli[WRITE(offset=362645413888, length=131072)]
Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error=5). label/poppstar.eli[WRITE(offset=362645544960, length=131072)]
Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error=5). label/poppstar.eli[WRITE(offset=362645676032, length=131072)]
Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error=5). label/poppstar.eli[WRITE(offset=362645807104, length=131072)]
Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error=5). label/poppstar.eli[WRITE(offset=362645938176, length=131072)]
Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error=5). label/poppstar.eli[WRITE(offset=362646069248, length=131072)]
Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error=5). label/poppstar.eli[WRITE(offset=362644889600, length=131072)]
Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Selection timeout
Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Retrying command
Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Selection timeout
Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Retrying command
Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Selection timeout
Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Retrying command
Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Selection timeout
Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Retrying command
Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): lost device
Oct 18 18:33:05 r500 kernel: (pass4:umass-sim2:2:0:0): lost device
Oct 18 18:33:05 r500 kernel: (pass4:umass-sim2:2:0:0): removing device entry
Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Error 6, Unretryable error
Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0):G EOM_ELIInvalidating pack:
Oct 18 18:33:05 r500 kernel: g_eli_read_done() failed label/poppstar.eli[READ(offset=2000398327808, length=8192)]
Oct 18 18:33:05 r500 kernel: GEOM_ELI: g_eli_read_done() failed label/poppstar.eli[READ(offset=270336, length=8192)]
Oct 18 18:33:05 r500 kernel: GEOM_ELI: g_eli_read_done() failed label/poppstar.eli[READ(offset=2000398589952, length=8192)]
Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error=6). label/poppstar.eli[WRITE(offset=362646200320, length=131072)]
Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error=6). label/poppstar.eli[WRITE(offset=362646331392, length=131072)]
Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error=6). label/poppstar.eli[WRITE(offset=362646593536, length=131072)]
Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error=6). label/poppstar.eli[WRITE(offset=362646724608, length=131072)]
Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error=6). label/poppstar.eli[WRITE(offset=362646462464, length=131072)]
Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error=6). label/poppstar.eli[WRITE(offset=362646855680, length=131072)]
Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error=6). label/poppstar.eli[WRITE(offset=362646986752, length=131072)]
Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error=6). label/poppstar.eli[WRITE(offset=362647117824, length=131072)]
Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error=6). label/poppstar.eli[WRITE(offset=362647248896, length=131072)]
Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error=6). label/poppstar.eli[WRITE(offset=362647379968, length=131072)]
Oct 18 18:33:05 r500 root: ZFS: vdev I/O failure, zpool=poppstar path=/dev/label/poppstar.eli offset=2000398327808 size=8192 error=6
Oct 18 18:33:05 r500 root: ZFS: vdev I/O failure, zpool=poppstar path=/dev/label/poppstar.eli offset=270336 size=8192 error=6
Oct 18 18:33:05 r500 root: ZFS: vdev I/O failure, zpool=poppstar path=/dev/label/poppstar.eli offset=2000398589952 size=8192 error=6
Oct 18 18:33:05 r500 root: ZFS: zpool I/O failure, zpool=poppstar error=6
Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Synchronize cache failed, status == 0xa, scsi status == 0x0
Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): removing device entry
Oct 18 18:33:05 r500 root: ZFS: zpool I/O failure, zpool=poppstar error=6
Oct 18 18:33:05 r500 last message repeated 151 times
Oct 18 18:33:05 r500 root: ZFS: zpool I/O failure, zpool=poppstar error=28
Oct 18 18:33:06 r500 last message repeated 161 times
Oct 18 18:33:06 r500 root: ZFS: vdev I/O failure, zpool=poppstar path= offset= size= error=
Oct 18 18:33:06 r500 root: Unknown USB device: vendor 0x13fd product 0x1240 bus uhub3
Oct 18 18:33:06 r500 kernel: ugen3.3: <Generic> at usbus3
Oct 18 18:33:06 r500 kernel: umass2: <Generic External, class 0/0, rev 2.00/2.10, addr 3> on usbus3
Oct 18 18:33:06 r500 kernel: umass2: SCSI over Bulk-Only; quirks = 0x0000
Oct 18 18:33:08 r500 kernel: umass2:4:2:-1: Attached to scbus4
Oct 18 18:33:08 r500 kernel: pass4 at umass-sim2 bus 2 scbus4 target 0 lun 0
Oct 18 18:33:08 r500 kernel: pass4: <Generic External 2.10> Fixed Direct Access SCSI-4 device
Oct 18 18:33:08 r500 kernel: pass4: Serial Number 395857304652454620202020
Oct 18 18:33:08 r500 kernel: pass4: 40.000MB/s transfers
Oct 18 18:33:08 r500 kernel: GEOM: new disk da2
Oct 18 18:33:08 r500 kernel: da2 at umass-sim2 bus 2 scbus4 target 0 lun 0
Oct 18 18:33:08 r500 kernel: da2: <Generic External 2.10> Fixed Direct Access SCSI-4 device
Oct 18 18:33:08 r500 kernel: da2: Serial Number 395857304652454620202020
Oct 18 18:33:08 r500 kernel: da2: 40.000MB/s transfers
Oct 18 18:33:08 r500 kernel: da2: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
Oct 18 18:34:02 r500 sudo: fk : TTY=pts/14 ; PWD=/home/fk ; USER=root ; COMMAND=/sbin/zpool list
Oct 18 18:34:02 r500 root: ZFS: zpool I/O failure, zpool=poppstar error=6
Oct 18 18:34:02 r500 root: ZFS: vdev I/O failure, zpool=poppstar path= offset= size= error=
Oct 18 18:36:34 r500 kernel: ugen3.3: <Generic> at usbus3 (disconnected)
Oct 18 18:36:34 r500 kernel: umass2: at uhub3, port 1, addr 3 (disconnected)
Oct 18 18:36:34 r500 kernel: (da2:umass-sim2:2:0:0): lost device
Oct 18 18:36:34 r500 kernel: (da2:umass-sim2:2:0:0): removing device entry
Oct 18 18:36:34 r500 kernel: (pass4:umass-sim2:2:0:0): lost device
Oct 18 18:36:34 r500 kernel: (pass4:umass-sim2:2:0:0): removing device entry
Oct 18 18:36:44 r500 root: Unknown USB device: vendor 0x13fd product 0x1240 bus uhub3
Oct 18 18:36:44 r500 kernel: ugen3.3: <Generic> at usbus3
Oct 18 18:36:44 r500 kernel: umass2: <Generic External, class 0/0, rev 2.00/2.10, addr 3> on usbus3
Oct 18 18:36:44 r500 kernel: umass2: SCSI over Bulk-Only; quirks = 0x0000
Oct 18 18:36:46 r500 kernel: umass2:4:2:-1: Attached to scbus4
Oct 18 18:36:46 r500 kernel: pass4 at umass-sim2 bus 2 scbus4 target 0 lun 0
Oct 18 18:36:46 r500 kernel: pass4: <Generic External 2.10> Fixed Direct Access SCSI-4 device
Oct 18 18:36:46 r500 kernel: pass4: Serial Number 395857304652454620202020
Oct 18 18:36:46 r500 kernel: pass4: 40.000MB/s transfers
Oct 18 18:36:46 r500 kernel: GEOM: new disk da2
Oct 18 18:36:46 r500 kernel: da2 at umass-sim2 bus 2 scbus4 target 0 lun 0
Oct 18 18:36:46 r500 kernel: da2: <Generic External 2.10> Fixed Direct Access SCSI-4 device
Oct 18 18:36:46 r500 kernel: da2: Serial Number 395857304652454620202020
Oct 18 18:36:46 r500 kernel: da2: 40.000MB/s transfers
Oct 18 18:36:46 r500 kernel: da2: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
Oct 18 18:36:46 r500 kernel: ugen3.3: <Generic> at usbus3 (disconnected)
Oct 18 18:36:46 r500 kernel: umass2: at uhub3, port 1, addr 3 (disconnected)
Oct 18 18:36:46 r500 kernel: ugen7.3: <Western Digital> at usbus7 (disconnected)
Oct 18 18:36:46 r500 kernel: umass1: at uhub7, port 1, addr 3 (disconnected)
Oct 18 18:36:46 r500 kernel: (da2:umass-sim2:2:0:0): lost device
Oct 18 18:36:46 r500 kernel: (da2:umass-sim2:2:0:0): removing device entry
Oct 18 18:36:46 r500 kernel: (pass4:umass-sim2:2:0:0): lost device
Oct 18 18:36:46 r500 kernel: (pass4:umass-sim2:2:0:0): removing device entry
Oct 18 18:36:46 r500 kernel: (da1:umass-sim1:1:0:0): lost device
Oct 18 18:36:46 r500 kernel: (pass3:umass-sim1:1:0:0): lost device
Oct 18 18:36:46 r500 kernel: (pass3:umass-sim1:1:0:0): removing device entry
Oct 18 18:36:46 r500 kernel: (da1:umass-sim1:1:0:0): Synchronize cache failed, status == 0xa, scsi status == 0x0
Oct 18 18:36:46 r500 kernel: (da1:umass-sim1:1:0:0): removing device entry
Oct 18 18:36:46 r500 kernel: ugen7.2: <Western Digital> at usbus7 (disconnected)
Oct 18 18:36:46 r500 kernel: umass0: at uhub7, port 2, addr 2 (disconnected)
Oct 18 18:36:46 r500 kernel: (da0:umass-sim0:0:0:0): lost device
Oct 18 18:36:46 r500 kernel: (pass2:umass-sim0:0:0:0): lost device
Oct 18 18:36:46 r500 kernel: (pass2:umass-sim0:0:0:0): removing device entry
Oct 18 18:36:46 r500 kernel: (da0:umass-sim0:0:0:0): Synchronize cache failed, status == 0xa, scsi status == 0x0
Oct 18 18:36:46 r500 kernel: (da0:umass-sim0:0:0:0): removing device entry
Oct 18 18:36:54 r500 root: ZFS: zpool I/O failure, zpool=wde2 error=28
Oct 18 18:36:54 r500 last message repeated 3 times
Oct 18 18:36:54 r500 root: ZFS: vdev I/O failure, zpool=wde2 path= offset= size= error=
Oct 18 18:36:54 r500 root: ZFS: zpool I/O failure, zpool=wde3 error=28
Oct 18 18:36:54 r500 last message repeated 3 times
Oct 18 18:36:54 r500 root: ZFS: vdev I/O failure, zpool=wde3 path= offset= size= error=
fk at r500 ~ $ps waux | grep \ [z]
root 5501 0.0 0.1 13364 1288 9 I+ 6:22PM 0:00.01 sudo zfs receive -v poppstar/dvds/sliders/season-4
root 5502 0.0 0.1 18944 1312 9 D+ 6:22PM 0:21.73 zfs receive -v poppstar/dvds/sliders/season-4
fk 4711 0.0 0.1 19980 1364 11 D+ 5:10PM 0:01.03 zpool iostat poppstar 1
root 5947 0.0 0.1 13364 1728 14 I+ 6:34PM 0:00.01 sudo zpool list
root 5948 0.0 0.1 19980 1760 14 D+ 6:34PM 0:00.01 zpool list
fk 5990 0.0 0.1 19980 1744 16 D+ 6:36PM 0:00.01 zpool list -H -o name
fk at r500 ~ $sudo procstat -kk `pgrep zfs`
PID TID COMM TDNAME KSTACK
5502 100519 zfs - mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 txg_wait_synced+0x7c dmu_tx_assign+0x16c dmu_recv_stream+0x125f zfs_ioc_recv+0x282 zfsdev_ioctl+0x8f devfs_ioctl_f+0x7b kern_ioctl+0x102 ioctl+0xfd syscallenter+0x331 syscall+0x4b Xfast_syscall+0xe2
fk at r500 ~ $sudo procstat -kk `pgrep zpool`
PID TID COMM TDNAME KSTACK
5990 100507 zpool - mi_switch+0x176 sleepq_wait+0x42 _sx_xlock_hard+0x3c2 _sx_xlock+0x6e spa_all_configs+0xbf zfs_ioc_pool_configs+0x29 zfsdev_ioctl+0x8f devfs_ioctl_f+0x7b kern_ioctl+0x102 ioctl+0xfd syscallenter+0x331 syscall+0x4b Xfast_syscall+0xe2
5948 100292 zpool - mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 dbuf_read+0x39a dnode_hold_impl+0xf0 dmu_buf_hold+0x34 zap_lockdir+0x52 zap_cursor_retrieve+0x194 spa_prop_get+0x26f zfs_ioc_pool_get_props+0xb2 zfsdev_ioctl+0x8f devfs_ioctl_f+0x7b kern_ioctl+0x102 ioctl+0xfd syscallenter+0x331 syscall+0x4b Xfast_syscall+0xe2
4711 100269 zpool - mi_switch+0x176 sleepq_wait+0x42 _sx_xlock_hard+0x3c2 _sx_xlock+0x6e spa_open_common+0x76 spa_get_stats+0x42 zfs_ioc_pool_stats+0x2c zfsdev_ioctl+0x8f devfs_ioctl_f+0x7b kern_ioctl+0x102 ioctl+0xfd syscallenter+0x331 syscall+0x4b Xfast_syscall+0xe2
At the time I've been using:
FreeBSD r500.local 9.0-CURRENT FreeBSD 9.0-CURRENT #193 r+3be3a96: Sun Oct 17 12:51:52 CEST 2010 fk at r500.local:/usr/obj/usr/src/sys/ZOEY amd64
with ZFS pool version 15, but I saw the problem with earlier
FreeBSD and ZFS versions, too.
Fabian
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 196 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20101031/8c23c8e0/signature.pgp
More information about the freebsd-fs
mailing list