Another ZFS ARC memory question

Tom Evans tevans.uk at googlemail.com
Fri Feb 24 12:59:02 UTC 2012


On Fri, Feb 24, 2012 at 12:44 PM, Luke Marsden
<luke-lists at hybrid-logic.co.uk> wrote:
> On Fri, 2012-02-24 at 12:21 +0000, Tom Evans wrote:
>> On Fri, Feb 24, 2012 at 11:06 AM, Luke Marsden
>> <luke-lists at hybrid-logic.co.uk> wrote:
>> > Hi all,
>> >
>> > Just wanted to get your opinion on best practices for ZFS.
>> >
>> > We're running 8.2-RELEASE v15 in production on 24GB RAM amd64 machines
>> > but have been having trouble with short spikes in application memory
>> > usage resulting in huge amounts of swapping, bringing the whole machine
>> > to its knees and crashing it hard.  I suspect this is because when there
>> > is a sudden spike in memory usage the zfs arc reclaim thread is unable
>> > to free system memory fast enough.
>> >
>> > This most recently happened yesterday as you can see from the following
>> > munin graphs:
>> >
>> > E.g. http://hybrid-logic.co.uk/memory-day.png
>> >     http://hybrid-logic.co.uk/swap-day.png
>> >
>> > Our response has been to start limiting the ZFS ARC cache to 4GB on our
>> > production machines - trading performance for stability is fine with me
>> > (and we have L2ARC on SSD so we still get good levels of caching).
>> >
>> > My questions are:
>> >
>> >      * is this a known problem?
>> >      * what is the community's advice for production machines running
>> >        ZFS on FreeBSD, is manually limiting the ARC cache (to ensure
>> >        that there's enough actually free memory to handle a spike in
>> >        application memory usage) the best solution to this
>> >        spike-in-memory-means-crash problem?
>> >      * has FreeBSD 9.0 / ZFS v28 solved this problem?
>> >      * rather than setting a hard limit on the ARC cache size, is it
>> >        possible to adjust the auto-tuning variables to leave more free
>> >        memory for spiky memory situations?  e.g. set the auto-tuning to
>> >        make arc eat 80% of memory instead of ~95% like it is at
>> >        present?
>> >      * could the arc reclaim thread be made to drop ARC pages with
>> >        higher priority before the system starts swapping out
>> >        application pages?
>> >
>> > Thank you for any/all answers, and thank you for making FreeBSD
>> > awesome :-)
>>
>> It's not a problem, it's a feature!
>>
>> By default the ARC will attempt to cache as much as it can - it
>> assumes the box is a ZFS filer, and doesn't need RAM for applications.
>> The solution, as you've found out, is to limit how much ARC can take
>> up.
>>
>> In practice, you should be doing this anyway. You should know, or have
>> an idea, of how much RAM is required for the applications on that box,
>> and you need to limit ZFS to not eat into that required RAM.
>
> Thanks for your reply, Tom!  I agree that the ARC cache is a great
> feature, but for a general purpose filesystem it does seem like a
> reasonable expectation that filesystem cache will be evicted before
> application data is swapped, even if the spike in memory usage is rather
> aggressive.  A complete server crash in this scenario is rather
> unfortunate.
>
> My question stands - is this an area which has been improved on in the
> ZFS v28 / FreeBSD 9.0 / upcoming FreeBSD 8.3 code, or should it be
> standard practice to guess how much memory the applications running on
> the server might need and set the arc_max boot.loader tweak
> appropriately?  This is reasonably tricky when providing general purpose
> web application hosting and so we'll often end up erring on the side of
> caution and leaving lots of RAM free "just in case".
>
> If the latter is indeed the case in the latest stable releases then I
> would like to update http://wiki.freebsd.org/ZFSTuningGuide which
> currently states:
>
>        FreeBSD 7.2+ has improved kernel memory allocation strategy and
>        no tuning may be necessary on systems with more than 2 GB of
>        RAM.
>
> Thank you!
>
> Best Regards,
> Luke Marsden
>

Hmm. That comment is really saying that you no longer need to tune
vm.kmem_size.

I get what you are saying: applications suddenly using a lot of RAM
should not cause the server to fall over. Do you know why it fell
over? I.e., was it a panic, a deadlock, etc.?

FreeBSD does not cope well when you have used up all RAM and swap
(well, what does?), and from your graphs it does look like the ARC is
not super massive when you had the problem - around 30-40% of RAM?
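
For reference, the sizing approach discussed earlier in the thread
(reserve RAM for the applications, then cap the ARC with what is left)
can be sketched as below. This is only an illustration: the function
names, the 16GB application reserve and the 10% spike headroom are
assumptions made up for the example, not measured values. The only
real setting involved is the vfs.zfs.arc_max tunable in
/boot/loader.conf.

```python
GIB = 1024 ** 3


def arc_max_bytes(total_ram, app_reserve, headroom_percent=10):
    """Return a suggested ARC cap in bytes.

    total_ram        -- physical RAM in bytes
    app_reserve      -- RAM the applications are expected to need (assumed)
    headroom_percent -- extra share of RAM kept free for spikes (assumed)
    """
    if not 0 <= headroom_percent < 100:
        raise ValueError("headroom_percent must be in [0, 100)")
    headroom = total_ram * headroom_percent // 100
    cap = total_ram - app_reserve - headroom
    if cap <= 0:
        raise ValueError("nothing left for the ARC with these reservations")
    return cap


def loader_conf_line(cap_bytes):
    """Format the cap as a /boot/loader.conf entry (value in bytes)."""
    return 'vfs.zfs.arc_max="%d"' % cap_bytes


if __name__ == "__main__":
    # Hypothetical box: 24GB RAM, 16GB reserved for applications.
    print(loader_conf_line(arc_max_bytes(24 * GIB, 16 * GIB)))
```

The tunable is read at boot, so a reboot is needed for a changed
value to apply; "sysctl vfs.zfs.arc_max" shows the limit in effect.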

Cheers

Tom

