slowdown of zfs (tx->tx)

Fri Jan 11 13:58:14 UTC 2013

----- Original Message ----- 
From: "Nicolas Rachinsky" <fbsd-mas-0 at ml.turing-complete.org>
To: "freebsd-fs" <freebsd-fs at freebsd.org>
Sent: Friday, January 11, 2013 11:11 AM
Subject: Re: slowdown of zfs (tx->tx)

>* Nicolas Rachinsky <fbsd-mas-0 at ml.turing-complete.org> [2013-01-10 20:39 +0100]:
>> after replacing one of the controllers, all problems seem to have
>> disappeared. Thank you very much for your advice!
> 
> Now the problem is back.
> 
> After changing the controller, there were no more timeouts logged.
> 
> No UDMA_CRC_Error_Count changed.
> 
> While the problem is present, top shows the following almost all the time:
> 
> last pid: 46322;  load averages:  0.90,  1.03,  0.98   up 0+11:07:55  08:28:41
> 39 processes:  1 running, 38 sleeping
> CPU:  0.0% user,  0.0% nice, 50.1% system,  0.0% interrupt, 49.9% idle
> Mem: 10M Active, 33M Inact, 7612M Wired, 23M Cache, 827M Buf, 234M Free
> Swap: 16G Total, 13M Used, 16G Free
> 
>  PID USERNAME   THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
>  926 root         1  44    0 28020K  2404K select  1   2:23  0.29% snmpd
> 41642 user1        1  44    0  5828K   204K tx->tx  0  20:53  0.00% rsync
> 41641 user1        1  44    0 29952K  3976K select  1  13:39  0.00% ssh
> 41640 user1        1  44    0  5828K   140K select  1   0:20  0.00% rsync
> 90399 user2        1  44    0 14020K   872K tx->tx  0   0:16  0.00% rsync
>  956 root         1  44    0 11808K   708K select  1   0:02  0.00% ntpd
> 1051 root         1  44    0  8356K   640K kqread  0   0:00  0.00% master
> 25713 root         1  44    0 38108K  3596K select  1   0:00  0.00% sshd
>  875 root         1  44    0  6920K   572K select  1   0:00  0.00% syslogd
> 1066 root         1  44    0  7976K   564K nanslp  1   0:00  0.00% cron
> 1058 postfix      1  44    0  8356K   792K kqread  1   0:00  0.00% qmgr
>  705 root         1  44    0  5248K   120K select  1   0:00  0.00% devd
> 25715 root         1  44    0 10248K  2828K pause   1   0:00  0.00% csh
> 1062 root         1  44    0 26176K   952K select  1   0:00  0.00% sshd
> 90401 user2        1  44    0 14020K   768K select  1   0:00  0.00% rsync
> 90400 user2        1  44    0 23808K   892K select  1   0:00  0.00% ssh
> 90372 user2        1  59    0  8344K   124K wait    0   0:00  0.00% sh
> 41619 user1        1  76    0  8344K    40K wait    1   0:00  0.00% sh
> 46322 root         1  44    0  9372K  1800K CPU1    1   0:00  0.00% top
> 89384 root         1  44    0  8344K   712K wait    0   0:00  0.00% sh
> 37854 root         1  45    0  8360K   472K piperd  1   0:00  0.00% sendmail
> 45382 postfix      1  44    0  8360K  1324K kqread  1   0:00  0.00% pickup
> 41608 root         1  76    0  8344K   440K wait    0   0:00  0.00% sh
> 25768 root         1  52    0 13440K  1716K nanslp  0   0:00  0.00% smartd
> 33599 root         1  50    0  8344K   452K wait    1   0:00  0.00% sh
> 33597 root         1  52    0  8344K   440K wait    1   0:00  0.00% sh
> 37855 root         1  44    0  8360K   468K piperd  0   0:00  0.00% postdrop
> 33591 root         1  44    0  7976K   524K piperd  1   0:00  0.00% cron
> 33595 root         1  46    0  8344K   436K wait    1   0:00  0.00% sh
> 33594 root         1  44    0  8344K   436K wait    1   0:00  0.00% sh
> 33592 root         1  45    0  7976K   524K piperd  1   0:00  0.00% cron
> 1106 root         1  76    0  6916K   352K ttyin   1   0:00  0.00% getty
> 1111 root         1  76    0  6916K   352K ttyin   1   0:00  0.00% getty
> 1107 root         1  76    0  6916K   352K ttyin   0   0:00  0.00% getty
> 1108 root         1  76    0  6916K   352K ttyin   0   0:00  0.00% getty
> 1112 root         1  76    0  6916K   352K ttyin   0   0:00  0.00% getty
> 1109 root         1  76    0  6916K   352K ttyin   1   0:00  0.00% getty
> 1113 root         1  76    0  6916K   352K ttyin   0   0:00  0.00% getty
> 1110 root         1  76    0  6916K   352K ttyin   0   0:00  0.00% getty
> 
> The result of
> sh -c "while :;do gstat -I 5s -b ;done" > gstat.txt & iostat -d -x -w 5 > iostat.txt & zpool iostat -v 5 > zpool.txt &
> is available via
> http://flummi.dauerreden.de/20130111/zpool.txt
> http://flummi.dauerreden.de/20130111/gstat.txt
> http://flummi.dauerreden.de/20130111/iostat.txt
> 

TBH it looks like you're just saturating your disks with the number of IOPS
you're doing.
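
A quick way to check this against the iostat.txt you already captured could
look like the sketch below. It assumes %b (percent busy) is the last column
of each device row in `iostat -x` output, and that 90% is a reasonable
cut-off for "saturated"; both are assumptions on my part, and `busy_devices`
is just a hypothetical helper name, not an existing tool.

```shell
# busy_devices: hypothetical helper, not part of FreeBSD.
# Reads `iostat -x` style output on stdin and prints the device name and
# %b value for every row whose last column is at or above the threshold.
# Assumption: %b (percent busy) is the last column of each device row.
busy_devices() {
    threshold="${1:-90}"   # default of 90% is an arbitrary choice
    awk -v t="$threshold" '
        # Header lines are skipped: their last field is not a plain number.
        $NF ~ /^[0-9]+(\.[0-9]+)?$/ && $NF + 0 >= t + 0 { print $1, $NF }
    '
}
```

Something like `busy_devices 90 < iostat.txt` would then list each sample
where a disk sat at or above 90% busy. If most samples for the pool's disks
show up, that would be consistent with the disks being the bottleneck.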

    Regards
    Steve
