New ZFSv28 patchset for 8-STABLE
Attila Nagy
bra at fsn.hu
Sun Jan 9 11:49:30 UTC 2011
On 01/09/2011 10:00 AM, Attila Nagy wrote:
> On 12/16/2010 01:44 PM, Martin Matuska wrote:
>> Hi everyone,
>>
>> following the announcement of Pawel Jakub Dawidek (pjd at FreeBSD.org) I am
>> providing a ZFSv28 testing patch for 8-STABLE.
>>
>> Link to the patch:
>>
>> http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz
>>
>>
> I've got an I/O hang with dedup enabled (not sure it's related; I've
> started to rewrite all data on the pool, which creates a heavy load):
>
> The processes are in various states:
> 65747 1001 1 54 10 28620K 24360K tx->tx 0 6:58 0.00% cvsup
> 80383 1001 1 54 10 40616K 30196K select 1 5:38 0.00% rsync
> 1501 www 1 44 0 7304K 2504K zio->i 0 2:09 0.00% nginx
> 1479 www 1 44 0 7304K 2416K zio->i 1 2:03 0.00% nginx
> 1477 www 1 44 0 7304K 2664K zio->i 0 2:02 0.00% nginx
> 1487 www 1 44 0 7304K 2376K zio->i 0 1:40 0.00% nginx
> 1490 www 1 44 0 7304K 1852K zfs 0 1:30 0.00% nginx
> 1486 www 1 44 0 7304K 2400K zfsvfs 1 1:05 0.00% nginx
>
> And everything that tries to touch the pool hangs.
>
> procstat shows this for one of the processes:
> # procstat -k 1497
> PID TID COMM TDNAME KSTACK
> 1497 100257 nginx - mi_switch sleepq_wait
> __lockmgr_args vop_stdlock VOP_LOCK1_APV null_lock VOP_LOCK1_APV
> _vn_lock nullfs_root lookup namei vn_open_cred kern_openat
> syscallenter syscall Xfast_syscall
No, it's not related. One of the disks in the RAIDZ2 pool went bad:
(da4:arcmsr0:0:4:0): READ(6). CDB: 8 0 2 10 10 0
(da4:arcmsr0:0:4:0): CAM status: SCSI Status Error
(da4:arcmsr0:0:4:0): SCSI status: Check Condition
(da4:arcmsr0:0:4:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
and that seems to have frozen the whole zpool. Removing the disk by hand
solved the problem.
I've seen this before on other machines using the ciss driver.
I wonder why ZFS didn't drop the disk from the pool on its own.
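
For reference, a minimal sketch of how a failing disk could be taken out of
a pool from the command line instead of by hand. This is not what was run
on the affected machine. The pool name "tank" is a hypothetical
placeholder; da4 is the device from the log above. The run helper and the
DRY_RUN switch are illustrative only, and by default the commands are just
printed. Whether zpool would even respond while the pool is hung is not
something this report establishes.

```shell
#!/bin/sh
# Hypothetical pool name; substitute the real one.
POOL="${POOL:-tank}"
# Device taken from the CAM error messages above.
DISK="${DISK:-da4}"
# DRY_RUN=1 (the default in this sketch) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

# Illustrative helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Show only the pools that currently report problems.
run zpool status -x
# Take the failing disk offline so ZFS stops issuing I/O to it.
run zpool offline "$POOL" "$DISK"
# Check the resulting state of the pool and its devices.
run zpool status "$POOL"
```

Setting DRY_RUN=0 would execute the commands for real; RAIDZ2 keeps
working with one member offline, so the pool should stay available in a
degraded state.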