ZFS performance regression in FreeBSD 12 ALPHA3->ALPHA4

Jakob Alvermark jakob at alvermark.net
Sat Sep 8 10:40:10 UTC 2018


                        Total     MFU     MRU    Anon     Hdr   L2Hdr   Other
      ZFS ARC            667M    186M    168M     13M   3825K      0K    295M

                                 rate    hits  misses  total hits  total misses
      arcstats                  : 99%   65636     605   167338494       9317074
      arcstats.demand_data      : 57%     431     321    13414675       2117714
      arcstats.demand_metadata  : 99%   65175     193   152969480       5344919
      arcstats.prefetch_data    :  0%       0      30        3292        401344
      arcstats.prefetch_metadata: 32%      30      61      951047       1453097
      zfetchstats               :  9%     119    1077      612582      55041789
      arcstats.l2               :  0%       0       0           0             0
      vdev_cache_stats          :  0%       0       0           0             0
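
For anyone reading the numbers above: the "rate" column appears to be hits divided by hits plus misses for the sampling interval. The sketch below recomputes it for a few rows. The helper name hit_rate_percent is hypothetical, and truncating to a whole percent is an assumption that happens to match the displayed figures; it is not taken from the systat source.

```python
def hit_rate_percent(hits: int, misses: int) -> int:
    """Return the hit rate as a whole percentage, truncated toward zero."""
    total = hits + misses
    if total == 0:
        return 0  # no accesses in the interval
    return hits * 100 // total


# (hits, misses) per-interval pairs copied from the systat -z output above.
samples = {
    "arcstats": (65636, 605),
    "arcstats.demand_data": (431, 321),
    "arcstats.prefetch_metadata": (30, 61),
    "zfetchstats": (119, 1077),
}

for name, (hits, misses) in samples.items():
    print(f"{name}: {hit_rate_percent(hits, misses)}%")
```

Under that assumption, the low demand_data rate (431 hits against 321 misses) is the figure that stands out: metadata is mostly served from the ARC, while a large share of data reads is not.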




This is while a 'make -j8 buildworld' is running (the machine has 8 cores).

SSH'ing into the machine while the buildworld is running, it takes 40-60
seconds to get to the shell!

Hitting ^T while waiting:

load: 1.06  cmd: zsh 45334 [arc_reclaim_waiters_cv] 56.11r 0.00u 0.10s 0% 5232k
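
To compare several ^T samples, the status line can be split into its fields. This is a minimal sketch: the regular expression is written only against the lines quoted in this thread, and parse_siginfo and its field names are hypothetical, not part of any FreeBSD tool.

```python
import re

# Pattern for the ^T (SIGINFO) status line as it appears in this thread.
SIGINFO_RE = re.compile(
    r"load:\s+(?P<load>[\d.]+)\s+"
    r"cmd:\s+(?P<cmd>\S+)\s+(?P<pid>\d+)\s+"
    r"\[(?P<wchan>[^\]]+)\]\s+"
    r"(?P<real>[\d.]+)r\s+(?P<user>[\d.]+)u\s+(?P<sys>[\d.]+)s"
)


def parse_siginfo(line):
    """Return the fields of one status line as a dict, or None if no match."""
    m = SIGINFO_RE.search(line)
    if m is None:
        return None
    return {
        "cmd": m["cmd"],
        "pid": int(m["pid"]),
        "wchan": m["wchan"],        # wait channel the process is sleeping on
        "real": float(m["real"]),   # elapsed wall-clock seconds
        "cpu": float(m["user"]) + float(m["sys"]),  # user + system seconds
    }


line = ("load: 1.06  cmd: zsh 45334 [arc_reclaim_waiters_cv] "
        "56.11r 0.00u 0.10s 0% 5232k")
info = parse_siginfo(line)
print(info["wchan"], info["real"], info["cpu"])
```

Comparing "real" with "cpu" shows that the shell spent nearly all of its 56 seconds asleep on arc_reclaim_waiters_cv rather than running.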

I will test the patch below and report back.


Jakob

On 9/7/18 7:27 PM, Cy Schubert wrote:
> I'd be interested in seeing systat -z output.
>
> ---
> Sent using a tiny phone keyboard.
> Apologies for any typos and autocorrect.
> Also, this old phone only supports top post. Apologies.
>
> Cy Schubert
> <Cy.Schubert at cschubert.com> or <cy at freebsd.org>
> The need of the many outweighs the greed of the few.
> ---
> ------------------------------------------------------------------------
> From: Mark Johnston
> Sent: 07/09/2018 09:09
> To: Jakob Alvermark
> Cc: Subbsd; allanjude at freebsd.org; freebsd-current Current
> Subject: Re: ZFS performance regression in FreeBSD 12 ALPHA3->ALPHA4
>
> On Fri, Sep 07, 2018 at 03:40:52PM +0200, Jakob Alvermark wrote:
> > On 9/6/18 2:28 AM, Mark Johnston wrote:
> > > On Wed, Sep 05, 2018 at 11:15:03PM +0300, Subbsd wrote:
> > >> On Wed, Sep 5, 2018 at 5:58 PM Allan Jude <allanjude at freebsd.org> wrote:
> > >>> On 2018-09-05 10:04, Subbsd wrote:
> > >>>> Hi,
> > >>>>
> > >>>> I'm seeing a huge loss in ZFS performance after upgrading FreeBSD 12
> > >>>> to the latest revision (r338466 at the moment), and it is related to ARC.
> > >>>>
> > >>>> I cannot say which revision I was running before, except that newvers.sh
> > >>>> pointed to ALPHA3.
> > >>>>
> > >>>> Problems are observed if you try to limit ARC. In my case:
> > >>>>
> > >>>> vfs.zfs.arc_max="128M"
> > >>>>
> > >>>> I know that this is very small. However, for two years there were
> > >>>> no problems with this setting.
> > >>>>
> > >>>> When I send SIGINFO to a process that is currently working with ZFS, I
> > >>>> see "arc_reclaim_waiters_cv":
> > >>>>
> > >>>> e.g. when I type:
> > >>>>
> > >>>> /bin/csh
> > >>>>
> > >>>> I have time (~5 seconds) to press 'ctrl+t' several times before csh is executed:
> > >>>>
> > >>>> load: 0.70  cmd: csh 5935 [arc_reclaim_waiters_cv] 1.41r 0.00u 0.00s 0% 3512k
> > >>>> load: 0.70  cmd: csh 5935 [zio->io_cv] 1.69r 0.00u 0.00s 0% 3512k
> > >>>> load: 0.70  cmd: csh 5935 [arc_reclaim_waiters_cv] 1.98r 0.00u 0.01s 0% 3512k
> > >>>> load: 0.73  cmd: csh 5935 [arc_reclaim_waiters_cv] 2.19r 0.00u 0.01s 0% 4156k
> > >>>>
> > >>>> The same happens with find or any other command:
> > >>>>
> > >>>> load: 0.34  cmd: find 5993 [zio->io_cv] 0.99r 0.00u 0.00s 0% 2676k
> > >>>> load: 0.34  cmd: find 5993 [arc_reclaim_waiters_cv] 1.13r 0.00u 0.00s 0% 2676k
> > >>>> load: 0.34  cmd: find 5993 [arc_reclaim_waiters_cv] 1.25r 0.00u 0.00s 0% 2680k
> > >>>> load: 0.34  cmd: find 5993 [arc_reclaim_waiters_cv] 1.38r 0.00u 0.00s 0% 2684k
> > >>>> load: 0.34  cmd: find 5993 [arc_reclaim_waiters_cv] 1.51r 0.00u 0.00s 0% 2704k
> > >>>> load: 0.34  cmd: find 5993 [arc_reclaim_waiters_cv] 1.64r 0.00u 0.00s 0% 2716k
> > >>>> load: 0.34  cmd: find 5993 [arc_reclaim_waiters_cv] 1.78r 0.00u 0.00s 0% 2760k
> > >>>>
> > >>>> this problem goes away after increasing vfs.zfs.arc_max
> > >>>> _______________________________________________
> > >>>> freebsd-current at freebsd.org mailing list
> > >>>> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> > >>>> To unsubscribe, send any mail to "freebsd-current-unsubscribe at freebsd.org"
> > >>>>
> > >>> Previously, ZFS was not actually able to evict enough dnodes to keep
> > >>> the ARC under your 128MB arc_max; actual usage would have been much
> > >>> higher, depending on the number of open files you had. A recent
> > >>> improvement from upstream ZFS (r337653 and r337660) was pulled in that
> > >>> fixed this, so setting an arc_max of 128MB is much more effective now,
> > >>> and that is causing the side effect of "actually doing what you asked
> > >>> it to do". In this case, what you are asking for is a bit silly: if you
> > >>> have a working set that is greater than 128MB and you ask ZFS to use
> > >>> less than that, it will have to constantly try to reclaim memory to
> > >>> keep under that very low bar.
> > >>>
> > >> Thanks for the comments. Mark was right when he pointed to r338416 (
> > >> https://svnweb.freebsd.org/base/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c?r1=338416&r2=338415&pathrev=338416
> > >> ). Commenting out aggsum_value restores normal speed regardless of the
> > >> rest of the new code from upstream.
> > >> I would like to repeat that the speed with these two lines is not just
> > >> slow, but _INCREDIBLY_ slow! This should probably be noted in the
> > >> relevant documentation for FreeBSD 12+.
> >
> > Hi,
> >
> > I am experiencing the same slowness when there is a bit of load on the
> > system (buildworld for example) which I haven't seen before.
>
> Is it a regression following a recent kernel update?
>
> > I have vfs.zfs.arc_max=2G.
> >
> > Top is reporting
> >
> > ARC: 607M Total, 140M MFU, 245M MRU, 1060K Anon, 4592K Header, 217M Other
> >       105M Compressed, 281M Uncompressed, 2.67:1 Ratio
> >
> > Should I test the patch?
>
> I would be interested in the results, assuming it is indeed a
> regression.
>

