Zfs heavy io writing | zfskern txg_thread_enter

Niccolò Corvini n.corvini at gmail.com
Fri Feb 19 11:07:04 UTC 2016


Hi, first time here!
We are having a problem with a server running FreeBSD 9.1 with ZFS on a
single SATA drive. For the past few days, the system has become really slow
in the morning because of very heavy write I/O. We investigated and we think
it might start at night, possibly correlated with the (standard) daily cron
jobs, but we are not sure. After a few hours the situation returns to normal.
Any help is much appreciated.
The machine is an Intel Xeon E5-2620 with 36GB of RAM; the HDD is 2TB and
is half full.
gstat output:

 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
   13    135     21    641  256.7    108   6410   41.4  128.8| ada0
    0      0      0      0    0.0      0      0    0.0    0.0| ada0p1
   13    135     21    641  256.7    108   6410   41.7  128.8| ada0p2
    0      0      0      0    0.0      0      0    0.0    0.0| cd0
    0      0      0      0    0.0      0      0    0.0    0.0| gptid/3c0de011-4f37-11e5-8217-3085a91c3292
    0      0      0      0    0.0      0      0    0.0    0.0| zvol/zroot/swap
   13    135     21    641  256.7    108   6410   41.7  128.9| gpt/disk1
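
To find out when the heavy writing starts, output like the above could be captured at intervals overnight and filtered for saturated devices. The following is only a minimal sketch in Python, assuming the column layout shown in the gstat output above; `GSTAT_COLUMNS`, `parse_gstat_line` and `busy_devices` are hypothetical names chosen for illustration and are not part of any FreeBSD tool.

```python
# Column names are illustrative labels for the nine numeric gstat fields.
GSTAT_COLUMNS = ("queue", "ops_s", "r_s", "r_kbps", "ms_r",
                 "w_s", "w_kbps", "ms_w", "busy_pct")


def parse_gstat_line(line):
    """Return a dict for one gstat data row, or None for headers and other text."""
    stats, sep, name = line.partition("|")
    fields = stats.split()
    if not sep or len(fields) != len(GSTAT_COLUMNS):
        return None
    try:
        values = [float(f) for f in fields]
    except ValueError:
        return None
    row = dict(zip(GSTAT_COLUMNS, values))
    row["name"] = name.strip()
    return row


def busy_devices(text, threshold=90.0):
    """Return the parsed rows whose %busy is at or above the threshold."""
    rows = (parse_gstat_line(line) for line in text.splitlines())
    return [r for r in rows if r is not None and r["busy_pct"] >= threshold]


# Two data rows copied from the gstat output in this message.
SAMPLE = """\
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
   13    135     21    641  256.7    108   6410   41.4  128.8| ada0
    0      0      0      0    0.0      0      0    0.0    0.0| ada0p1
"""

for row in busy_devices(SAMPLE):
    # Average size of a write request in kB (write kBps divided by writes/s).
    avg_write_kb = row["w_kbps"] / row["w_s"] if row["w_s"] else 0.0
    print(row["name"], row["busy_pct"], round(avg_write_kb, 1))
```

For the ada0 row above this works out to roughly 59 kB per write request. The 90% threshold is an arbitrary choice for the sketch.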

Using top -m io shows that the process responsible is [zfskern{txg_thread_enter}].
top -m io output:

 PID JID USERNAME   VCSW  IVCSW   READ  WRITE  FAULT  TOTAL PERCENT COMMAND
    3   0 root         14      1      0     37      0     37  30.33% [zfskern{txg_thread_enter}]
49866 215   7070       26      2      0      5      0      5   4.10% postgres: stats collector process    (postgres)
99901   5     70       42      0      0      4      0      4   3.28% postgres: promeditec promeditec.osr.test 192.168.0.246(278
24820 199 www          10      0      7      0      0      7   5.74% [jsvc{jsvc}]
33869 212     88       19      2      0      2      0      2   1.64% [mysqld{mysqld}]
93400   0 root         13      0     10      0      0     10   8.20% [find]
89407 215   7070       10      0      0      1      0      1   0.82% postgres: alfresco alfconservazione.dotcom.ts.it 192.168.0
15776   5     70       11      0      0      4      0      4   3.28% postgres: stats collector process    (postgres)
33869 212     88       10      0      0      3      0      3   2.46% [mysqld{mysqld}]
33869 212     88        2      0      0     11      0     11   9.02% [mysqld{mysqld}]
18685 198 root          5      0      0      2      0      2   1.64% /usr/sbin/syslogd -s
15852 214     70        4      1      0      1      0      1   0.82% postgres: alfresco alfcomunets.dotcom.ts.it 192.168.0.212(
98335 120 root         11      0     29      0      0     29  23.77% find /var/log -name messages.* -mtime -2
16128 214     70        8      0      0      1      0      1   0.82% postgres: alfresco alfaxErre8 192.168.0.208(50558)  (postg
 1116 198 root         10      0      0      1      0      1   0.82% sendmail: ./u1J9k90d001112 local: client DATA status (send
 1120 198 root          7      0      0      4      0      4   3.28% mail.local -l

Using procstat -kk on the zfskern pid shows:

 PID    TID COMM             TDNAME           KSTACK
    3 100129 zfskern          arc_reclaim_thre mi_switch sleepq_timedwait _cv_timedwait arc_reclaim_thread fork_exit fork_trampoline
    3 100130 zfskern          l2arc_feed_threa mi_switch sleepq_timedwait _cv_timedwait l2arc_feed_thread fork_exit fork_trampoline
    3 100504 zfskern          txg_thread_enter mi_switch sleepq_wait _cv_wait txg_thread_wait txg_quiesce_thread fork_exit fork_trampoline
    3 100505 zfskern          txg_thread_enter mi_switch sleepq_wait _cv_wait zio_wait dsl_pool_sync spa_sync txg_sync_thread fork_exit fork_trampoline
    3 100506 zfskern          zvol zroot/swap  mi_switch sleepq_wait _sleep zvol_geom_worker fork_exit fork_trampoline

systat -vmstat

    7 users    Load  0.50  0.62  1.46                  Feb 19 11:46

Mem:KB    REAL            VIRTUAL                       VN PAGER   SWAP PAGER
        Tot   Share      Tot    Share    Free           in   out     in   out
Act  41318k  199492 1251183k   588716 2254748  count
All  46844k  229048   -1968M   892088          pages
Proc:                                                            Interrupts
  r   p   d   s   w   Csw  Trp  Sys  Int  Sof  Flt        cow    2836 total
             4k      8516  393  12k  236 1445  11k  11281 zfod        atkbd0 1
                                                          ozfod       acpi0 9
 4.4%Sys   0.0%Intr  0.3%User  0.0%Nice 95.4%Idle        %ozfod       ehci0 17
|    |    |    |    |    |    |    |    |    |    |       daefr       ehci1 23
==                                                  11171 prcfr    79 cpu0:timer
                                           dtbuf    11265 totfr       isci0 264
Namei     Name-cache   Dir-cache   1095774 desvn          react    24 em0:rx 0
   Calls    hits   %    hits   %    409282 numvn          pdwak    16 em0:tx 0
      78      41  53                273943 frevn          pdpgs       em0:link
                                                          intrn   196 ahci0 278
Disks  ada0   cd0 pass0 pass1                    20445132 wire    267 cpu21:time
KB/t   2.55  0.00  0.00  0.00                    37317552 act      55 cpu13:time
tps     223     0     0     0                     4948708 inact    86 cpu5:timer
MB/s   0.56  0.00  0.00  0.00                      884804 cache    24 cpu12:time
%busy    94     0     0     0                     1370012 free     63 cpu10:time
                                                          buf     306 cpu19:time
                                                                   63 cpu11:time
                                                                   55 cpu14:time
                                                                   86 cpu9:timer
                                                                   71 cpu18:time
                                                                   86 cpu3:timer
                                                                   47 cpu23:time
                                                                   55 cpu6:timer
                                                                   55 cpu22:time
                                                                   71 cpu2:timer
                                                                   39 cpu20:time

zpool status:
  pool: zroot
 state: ONLINE
  scan: scrub repaired 0 in 3h46m with 0 errors on Wed Nov  4 21:54:44 2015
config:

        NAME         STATE     READ WRITE CKSUM
        zroot        ONLINE       0     0     0
          gpt/disk1  ONLINE       0     0     0

errors: No known data errors


More information about the freebsd-fs mailing list