zfs arc and amount of wired memory

Charles Sprickman spork at bway.net
Thu Feb 9 01:32:07 UTC 2012


On Feb 8, 2012, at 7:11 PM, Miroslav Lachman wrote:

> Andriy Gapon wrote:
>> on 08/02/2012 12:31 Eugene M. Zheganin said the following:
>>> Hi.
>>> 
>>> On 08.02.2012 02:17, Andriy Gapon wrote:
>>>> [output snipped]
>>>> 
>>>> Thank you.  I don't see anything suspicious/unusual there.
>>>> Just in case, do you have ZFS dedup enabled by any chance?
>>>> 
>>>> I think that examination of vmstat -m and vmstat -z outputs may provide some
>>>> clues as to what got all that memory wired.
>>>> 
>>> Nope, I don't have deduplication feature enabled.
>> 
>> OK.  So, did you have a chance to inspect vmstat -m and vmstat -z?
>> 
>>> By the way, today, after eating another 100M of wired memory, this server hung
>>> with a nonstop stream of messages
>>> 
>>> swap_pager: indefinite wait buffer
>>> 
>>> Since it's swapping on a zvol, it looks to me like it could be the resource
>>> starvation issue mentioned in another thread here ("Swap on zvol - recommendable?");
>>> maybe it happens faster when the ARC isn't limited.
>> 
>> It could very well be that swap on zvol doesn't work well when the
>> kernel itself is starved of memory.
>> 
>>> So I want to ask: how do I report it, and what should I include in such a PR?
>> 
>> I am leaving swap-on-zvol issue aside.  Your original problem doesn't seem to be
>> ZFS-related.  I suspect that you might be running into some kernel memory leak.
>>  If you manage to reproduce the high wired value again, then vmstat -m and
>> vmstat -z may provide some useful information.
>> 
>> In this vein, do you use any out-of-tree kernel modules?
>> Also, can you try to monitor your system to see when wired count grows?
> 
> I am seeing something similar on one of our machines. This is an old 7.3 with ZFS v13, which is why I did not report it.
> 
> The machine is used as storage for backups made by rsync. Everything runs fine for about 107 days. Then backups get slower and slower because of some strange memory situation.
> 
> Mem: 15M Active, 17M Inact, 3620M Wired, 420K Cache, 48M Buf, 1166M Free
> 
> ARC Size:
>         Current Size:             1769 MB (arcsize)
>         Target Size (Adaptive):   512 MB (c)
>         Min Size (Hard Limit):    512 MB (zfs_arc_min)
>         Max Size (Hard Limit):    3584 MB (zfs_arc_max)
> 
> The target size drops to the min size, and after a few more days the system is so slow that I must reboot the machine. Then it runs fine for about 107 days and it all repeats again.
> 
> You can see more on MRTG graphs
> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/
> You can see links to other useful information at the top of the page (arc_summary, top, dmesg, fs usage, loader.conf)
> 
> There you can see nightly backups (higher CPU load started at 01:13), otherwise the machine is idle.
> 
> It corresponds with the ARC target size dropping over the last 5 days
> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_arcstats_size.html
> 
> And with the ARC metadata cache exceeding its limit over the last 5 days
> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_vfs_meta.html

I'm not having luck finding the reference, but there is a known issue, present even in 8.2, where some 32-bit counter overflows (I don't remember the exact logic). When you hit it, it's at around 110 days of uptime. Before it gets really bad (to the point where you either reboot or get a memory exhaustion panic), you can see the ZFS "evict skips" counter incrementing rapidly. Looking at that graph, that would be my guess as to what's happening to you.

It's easy to check: run one of the ARC stats scripts, look for "evict_skips", note the number, and run the script again a few minutes later. If the value has increased by more than a few hundred, you've hit the bug. From that point on the kernel no longer evicts data from the ARC, and the ARC just keeps growing until bad things happen.
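
For anyone who would rather script that check than eyeball it, here is a rough sketch in Python. It rests on several assumptions that are not from this thread: that the counter is exposed through sysctl as kstat.zfs.misc.arcstats.evict_skip (verify the name with `sysctl kstat.zfs.misc.arcstats` on your release), that five minutes is a reasonable sampling gap, and that 500 is a fair stand-in for "more than a few hundred". The function names are made up for illustration.

```python
import subprocess
import time

# Assumed sysctl name for the ARC "evict skip" counter; check it on your
# own system, since it may differ between releases.
EVICT_SKIP_OID = "kstat.zfs.misc.arcstats.evict_skip"


def read_evict_skip():
    """Return the current counter value as reported by sysctl(8)."""
    out = subprocess.check_output(["sysctl", "-n", EVICT_SKIP_OID])
    return int(out.decode().strip())


def evict_skip_delta(read=read_evict_skip, interval=300, sleep=time.sleep):
    """Sample the counter twice, `interval` seconds apart, and return the growth."""
    first = read()
    sleep(interval)
    second = read()
    return second - first


def looks_stuck(delta, threshold=500):
    """Apply the rule of thumb: growth beyond a few hundred is suspicious.

    The default threshold is an illustrative guess, not a documented limit.
    """
    return delta > threshold
```

On the affected machine, `looks_stuck(evict_skip_delta())` would block for five minutes and then return True if the counter jumped by more than the threshold. The `read` and `sleep` parameters exist only so the logic can be exercised without a FreeBSD host.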

Charles

> 
> I don't know what's going on, and I don't know if it is something known / fixed in newer releases. We are running a few more ZFS systems on 8.2 without this issue, but those systems are in different roles.
> 
> Miroslav Lachman
> _______________________________________________
> freebsd-stable at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe at freebsd.org"


