slowdown of zfs (tx->tx)

Nicolas Rachinsky fbsd-mas-0 at ml.turing-complete.org
Mon Jan 14 19:51:50 UTC 2013


* Artem Belevich <art at freebsd.org> [2013-01-14 11:13 -0800]:
> On Mon, Jan 14, 2013 at 1:40 AM, Nicolas Rachinsky
> <fbsd-mas-0 at ml.turing-complete.org> wrote:
> >   5 Reallocated_Sector_Ct   0x0033   094   094   010    Pre-fail  Always       -       166
> > 195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       1259614646
> > 196 Reallocated_Event_Count 0x0032   096   096   000    Old_age   Always       -       166
> 
> > Reallocated_Sector_Ct did not increase during the last days.
> 
> It does not matter IMHO. That hard drive already got quite a few bad
> sectors that ECC could not deal with. There are apparently more
> marginally bad sectors, but ECC deals with it for now. Once enough
> bits rot, you'll get more bad sectors. I personally would replace the
> drive.

Yes, I'll do that. 

> >> Cound you do gstat with 1-second interval. Some of the 5-second
> >> samples show that ada8 is the bottleneck -- it has its request queue
> >> full (L(q)=10) when all other drives were done with their jobs. And
> >> that's a 5-sec average. Its write service time also seems to be a lot
> >> higher than for other drives.
> >
> > Attached.  I have replace ada8 by ada9, which is a Western Digital
> > Caviar Black.
> >
> > Now ada0 and ada4 seem to be the bottleneck.
> >
> > But I don't understand the intervalls without any disk activity.
> 
> It is puzzling. Is rsync still sleeping in tx->tx state? Try running
> "procstat -kk <rsync-PID>" periodically. It will print in-kernel stack
> trace and may help giving a clue where/why rsync is stuck.

# sh -c 'for i in `jot 100`; do procstat -kk 36639 ; sleep 1; done' | sort | uniq -c
 100   PID    TID COMM             TDNAME           KSTACK                       
   1 36639 100574 rsync            -                <running>                    
  99 36639 100574 rsync            -                mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 txg_wait_open+0x85 zfs_freebsd_write+0x3a6 VOP_WRITE_APV+0xb2 vn_write+0x373 dofilewrite+0x8b kern_writev+0x60 write+0x55 amd64_syscall+0x1f4 Xfast_syscall+0xfc 

# sh -c 'for i in `jot 100`; do procstat -kk 36639 ; sleep 0.36; done' | sort | uniq -c
 100   PID    TID COMM             TDNAME           KSTACK                       
   1 36639 100574 rsync            -                mi_switch+0x176 sleepq_timedwait+0x42 _cv_timedwait+0x134 txg_delay+0x137 dsl_pool_tempreserve_space+0xd5 dsl_dir_tempreserve_space+0x154 dmu_tx_assign+0x370 zfs_freebsd_write+0x38a VOP_WRITE_APV+0xb2 vn_write+0x373 dofilewrite+0x8b kern_writev+0x60 write+0x55 amd64_syscall+0x1f4 Xfast_syscall+0xfc 
  99 36639 100574 rsync            -                mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 txg_wait_open+0x85 zfs_freebsd_write+0x3a6 VOP_WRITE_APV+0xb2 vn_write+0x373 dofilewrite+0x8b kern_writev+0x60 write+0x55 amd64_syscall+0x1f4 Xfast_syscall+0xfc 

# sh -c 'for i in `jot 100`; do procstat -kk 36639 ; sleep 0.1; done' | sort | uniq -c
 100   PID    TID COMM             TDNAME           KSTACK                       
 100 36639 100574 rsync            -                mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 txg_wait_open+0x85 zfs_freebsd_write+0x3a6 VOP_WRITE_APV+0xb2 vn_write+0x373 dofilewrite+0x8b kern_writev+0x60 write+0x55 amd64_syscall+0x1f4 Xfast_syscall+0xfc 


Thanks in advance

Nicolas

-- 
http://www.rachinsky.de/nicolas


More information about the freebsd-fs mailing list