Recurring ZFS performance problems

Steven Hartland killing at multiplay.co.uk
Mon Mar 10 17:32:25 UTC 2014


It looks like you may be out of IOPS, but just in case: are you using TRIM at all?
sysctl -a | grep trim

If you are, what does "gstat -d" show?
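
As a rough check on the "out of IOP/s" guess: the gstat line for vtbd0 quoted below shows 574 w/s at 653 kBps with the device at 100.4 %busy, which is only about 1.1 kB per write. A minimal sketch of that arithmetic follows; the helper name avg_write_kb is made up for illustration and is not part of gstat.

```sh
# avg_write_kb W_PER_SEC KBPS
# Hypothetical helper: average kB per write, from the w/s and write kBps
# columns of one gstat row.
avg_write_kb() {
    awk -v w="$1" -v k="$2" \
        'BEGIN { if (w > 0) printf "%.2f\n", k / w; else print "n/a" }'
}

# Values taken from the vtbd0 row in the quoted gstat output.
avg_write_kb 574 653    # prints 1.14
```

Writes that small on a device that is already 100% busy would point at an operation-rate limit rather than a bandwidth limit.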

You also mention you're using MySQL; have you applied the standard tuning
for MySQL on ZFS?
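
The message does not spell out which tuning is meant. Settings commonly suggested for InnoDB on ZFS are a 16K recordsize on the data dataset (matching InnoDB's default page size) and primarycache=metadata, so the ARC does not cache data that the InnoDB buffer pool already holds. The sketch below only prints the commands for review. The dataset name tank/mysql is an assumption (only the pool name "tank" appears in the original), and whether these settings help in this case is untested.

```sh
# Print, but do not run, commonly suggested ZFS properties for an InnoDB
# dataset. The dataset name is an argument; tank/mysql is hypothetical.
mysql_zfs_tuning() {
    ds="$1"
    # 16K matches InnoDB's default page size.
    echo "zfs set recordsize=16K $ds"
    # Cache only metadata in the ARC; InnoDB has its own buffer pool.
    echo "zfs set primarycache=metadata $ds"
}

mysql_zfs_tuning tank/mysql
```

Note that a changed recordsize only applies to files written afterwards, so existing table files keep their old block size until they are rewritten.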

    Regards
    Steve
----- Original Message ----- 
From: "Adrian Gschwend" <ml-ktk at netlabs.org>
To: <freebsd-fs at freebsd.org>
Sent: Monday, March 10, 2014 5:05 PM
Subject: Reoccuring ZFS performance problems


> Hi group,
> 
> (I have a lot of pastes in here, see http://pastebin.com/yjQnLryP for
> this email in case the mail kills too long lines)
> 
> On a regular basis I run into some very weird ZFS performance issues on
> my server. When it happens, file IO is terribly slow and even a simple ls
> can take a long time (worst case up to minutes). Everything that relies
> on file IO is basically dead in this state, so even starting top or other
> tools is a PITA. This state can last from minutes to hours and goes back
> to normal after some random time. A reboot does not necessarily fix it;
> often I'm back in exactly this state after a reboot.
> 
> I do not see any pattern in my monitoring (munin) for when it happens:
> before munin starts to time out because of it, none of the system graphs
> I collect shows a peak.
> 
> When I run 'top' in this mode I see many processes in one of these states:
> 
> tx->tx
> zfs
> umtxn (mainly on mysql, which is unkillable in this mode)
> uwait
> 
> 
> Setup:
> * FreeBSD 9.2-RELEASE-p3 in a KVM (SmartOS Solaris host, running ZFS itself)
> * I'm using mfsbsd to do a ZFS only system
> * No specific ZFS changes (I did play with some, see last part of email)
> * There are 5 jails running
> * I run various Apaches (PHP/SVN/TRAC etc), MySQL, LDAP daemon, a JVM
> and some SIP servers (Freeswitch)
> * Normal load is around 20-30% (2 cores)
> * Swap is currently on 1% usage (4G available)
> * I have 16GB of memory available, munin still shows around 1-2 GB as
> free. It can be that the issue happens faster with less memory but I
> cannot prove it either.
> * Currently no dtrace enabled so can't get much further than standard
> tools shipped with BSD
> * zpool status does not report any failures
> 
> The issues are not new; they first appeared while the system was still
> running on real hardware (FBSD 8.x) and not within a KVM. Back then I
> assumed I had a hardware problem, but the problem reappeared on the
> virtualized install. This install was basically a 1:1 zfs send copy of
> the old system plus some bootloader hacking, so exactly the same software
> levels. I then switched to a new install on 9.x and had the issue on every
> single release there as well. I did not try 10.x yet.
> 
> fstat | wc -l:
> 
> 7068
> 
> (took forever to let it finish)
> 
> gstat gives me:
> 
> dT: 1.010s  w: 1.000s
> L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>   10    575      1      0    0.2    574    653   17.8  100.4| vtbd0
>    0      0      0      0    0.0      0      0    0.0    0.0| PART/vtbd0/vtbd0
>    0      0      0      0    0.0      0      0    0.0    0.0| vtbd0p1
>    0      0      0      0    0.0      0      0    0.0    0.0| vtbd0p2
>   10    575      1      0    0.3    574    653   18.0  100.4| vtbd0p3
>    0      0      0      0    0.0      0      0    0.0    0.0| DEV/vtbd0/vtbd0
>    0      0      0      0    0.0      0      0    0.0    0.0| cd0
>    0      0      0      0    0.0      0      0    0.0    0.0| DEV/vtbd0p1/vtbd0p1
>    0      0      0      0    0.0      0      0    0.0    0.0| LABEL/vtbd0p1/vtbd0p1
>    0      0      0      0    0.0      0      0    0.0    0.0| gptid/e402ecce-89ca-11e2-a867-3264262b9894
>    0      0      0      0    0.0      0      0    0.0    0.0| DEV/vtbd0p2/vtbd0p2
>    0      0      0      0    0.0      0      0    0.0    0.0| LABEL/vtbd0p2/vtbd0p2
>    0      0      0      0    0.0      0      0    0.0    0.0| gptid/e4112d88-89ca-11e2-a867-3264262b9894
>    0      0      0      0    0.0      0      0    0.0    0.0| DEV/vtbd0p3/vtbd0p3
>    0      0      0      0    0.0      0      0    0.0    0.0| SWAP/swap/gptid/e4112d88-89ca-11e2-a867-3264262b9894
>    0      0      0      0    0.0      0      0    0.0    0.0| DEV/cd0/cd0
>    0      0      0      0    0.0      0      0    0.0    0.0| DEV/gptid/e402ecce-89ca-11e2-a867-3264262b9894/gptid/e402ecce-89ca-11e2-a867-3264262b9894
>    0      0      0      0    0.0      0      0    0.0    0.0| DEV/gptid/e4112d88-89ca-11e2-a867-3264262b9894/gptid/e4112d88-89ca-11e2-a867-3264262b9894
>    0      0      0      0    0.0      0      0    0.0    0.0| ZFS::VDEV/zfs::vdev/vtbd0p3
> 
> ms/w changes a lot, highest I've seen right now was around 70
> 
> zpool iostat 2:
> 
>               capacity     operations    bandwidth
> pool        alloc   free   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> tank         173G  21.5G      3     56  30.9K   535K
> tank         173G  21.5G     62    376  49.0K  1.22M
> tank         173G  21.5G     46    340  84.3K   565K
> tank         173G  21.5G     45    566  74.6K   800K
> tank         173G  21.5G     32    222  92.0K   958K
> tank         173G  21.5G     63    392   120K  1.10M
> tank         173G  21.5G     16    286  14.2K   338K
> tank         173G  21.5G     29    313  24.6K   831K
> tank         173G  21.5G      0    289      0   445K
> tank         173G  21.5G     27    244  32.6K   293K
> tank         173G  21.5G     43    385  42.8K   477K
> tank         173G  21.5G     31    329  15.7K   710K
> tank         173G  21.5G     65    394  46.8K  1.50M
> tank         173G  21.5G     80    320   127K   754K
> tank         173G  21.5G     30    425   144K  1.09M
> tank         173G  21.5G     13    399  25.9K   379K
> tank         173G  21.5G     10    194  5.22K   162K
> tank         173G  21.5G     18    311  45.5K  1.02M
> tank         173G  21.5G     29    202  58.5K   344K
> tank         173G  21.5G     32    375   108K   926K
> 
> 
> on the host OS (SmartOS) zpool iostat 2 shows me (there is one other
> FBSD box but there is almost no IO on this one):
> 
>               capacity     operations    bandwidth
> pool        alloc   free   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> zones        602G   230G      5    278  44.0K  2.42M
> zones        602G   230G      2    154  19.9K  2.18M
> zones        602G   230G      0  1.47K      0  10.3M
> zones        602G   230G      0    128      0  1.44M
> zones        602G   230G      0    270      0  2.61M
> zones        602G   230G      0  1.39K      0  10.2M
> zones        602G   230G      0    114  7.96K  2.10M
> zones        602G   230G      0    979  7.96K  7.84M
> 
> When the guest system is not in this state the writes are lower and I
> don't see modes where I have 0 on reads.
> 
> I was googling around about this topic forever and I do find people who
> report similar issues. No one I've contacted found a real explanation
> for it. Based on various guides I started adapting the basic config:
> 
> cat /boot/loader.conf:
> 
> vfs.zfs.zfetch.block_cap=64
> 
> # this one was horrible, bootup alone was dogslow
> #vfs.zfs.write_limit_override=1048576
> 
> #vfs.zfs.txg.timeout="5"
> 
> # so far good results?
> vfs.zfs.prefetch_disable="1"
> 
> At first I thought disabling prefetch had solved the issue for a while, but
> it looks like I was too optimistic with that one. However, ls feels
> *much* faster when the system is happy since I disabled prefetch.
> 
> I'm really totally lost on this one so I would appreciate hints about
> how to debug that.
> 
> I'm willing to test whatever it takes to figure out where this issue is.
> 
> thanks
> 
> Adrian