svn commit: r315449 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs
Allan Jude
allanjude at freebsd.org
Wed Apr 11 05:04:59 UTC 2018
On 2018-02-25 22:56, Allan Jude wrote:
> On 2017-03-17 08:34, Steven Hartland wrote:
>> Author: smh
>> Date: Fri Mar 17 12:34:57 2017
>> New Revision: 315449
>> URL: https://svnweb.freebsd.org/changeset/base/315449
>>
>> Log:
>> Reduce ARC fragmentation threshold
>>
>> As ZFS can request memory blocks of up to SPA_MAXBLOCKSIZE, e.g. during
>> zfs recv, update the threshold at which we start aggressive reclamation
>> to use SPA_MAXBLOCKSIZE (16M) instead of the lower zfs_max_recordsize,
>> which defaults to 1M.
>>
>> PR: 194513
>> Reviewed by: avg, mav
>> MFC after: 1 month
>> Sponsored by: Multiplay
>> Differential Revision: https://reviews.freebsd.org/D10012
>>
>> Modified:
>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
>>
>> Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
>> ==============================================================================
>> --- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Fri Mar 17 12:34:56 2017 (r315448)
>> +++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Fri Mar 17 12:34:57 2017 (r315449)
>> @@ -3978,7 +3978,7 @@ arc_available_memory(void)
>> * Start aggressive reclamation if too little sequential KVA left.
>> */
>> if (lowest > 0) {
>> - n = (vmem_size(heap_arena, VMEM_MAXFREE) < zfs_max_recordsize) ?
>> + n = (vmem_size(heap_arena, VMEM_MAXFREE) < SPA_MAXBLOCKSIZE) ?
>> -((int64_t)vmem_size(heap_arena, VMEM_ALLOC) >> 4) :
>> INT64_MAX;
>> if (n < lowest) {
>>
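For context, the check being changed works roughly like this (a simplified standalone sketch, not the actual kernel code; the vmem_size() results are passed in as plain numbers here):

```c
#include <stdint.h>

#define SPA_MAXBLOCKSIZE (16ULL * 1024 * 1024)	/* 16 MB */

/*
 * Simplified model of the FMR_ZIO_FRAG check after r315449: when the
 * largest contiguous free KVA chunk is smaller than SPA_MAXBLOCKSIZE,
 * report a deficit of 1/16th of the allocated KVA (the ">> 4" above),
 * which drives aggressive ARC reclamation.
 */
static int64_t
frag_deficit(uint64_t kva_max_free, uint64_t kva_alloc)
{
	if (kva_max_free < SPA_MAXBLOCKSIZE)
		return (-((int64_t)(kva_alloc >> 4)));	/* negative => reclaim */
	return (INT64_MAX);				/* no pressure */
}
```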
>
> I have some users reporting excessive ARC shrinking in 11.1 vs 11.0 due
> to this change.
>
> Memory seems quite fragmented, and this change makes ZFS much more
> sensitive to that, but the problem seems to be that it can get too
> aggressive.
>
> In the most recent case, the machine has 128 GB of RAM and no other
> major processes running, just ZFS zvols being served over iSCSI by ctld.
>
> arc_max is set to 85 GB, which is rather conservative. After running
> for a few days, fragmentation seems to trip this line: when there is no
> 16 MB contiguous block left, the ARC is shrunk by 1/16th of memory.
> That does not produce a 16 MB contiguous chunk, so the ARC shrinks by
> another 1/16th, and again until it hits arc_min. Apparently the ARC
> does eventually regrow, but then collapses again later.
>
> You can see the ARC oscillating between arc_max and arc_min, with some
> long periods pinned at arc_min: https://imgur.com/a/emztF
>
>
> [root at ZFS-AF ~]# vmstat -z | tail +3 | awk -F '[:,] *' 'BEGIN { total=0;
> cache=0; used=0 } {u = $2 * $4; c = $2 * $5; t = u + c; cache += c; used
> += u; total += t; name=$1; gsub(" ", "_", name); print t, name, u, c}
> END { print total, "TOTAL", used, cache } ' | sort -n | perl -a -p -e
> 'while (($j, $_) = each(@F)) { 1 while s/^(-?\d+)(\d{3})/$1,$2/; print
> $_, " "} print "\n"' | column -t | tail
> TOTAL NAME USED CACHE
> 1,723,367,424 zio_data_buf_49152 1,722,875,904 491,520
> 1,827,057,664 zio_buf_4096 1,826,848,768 208,896
> 2,289,459,200 zio_data_buf_40960 2,289,090,560 368,640
> 3,642,736,640 zio_data_buf_81920 3,642,408,960 327,680
> 6,713,180,160 zio_data_buf_98304 6,712,688,640 491,520
> 9,388,195,840 zio_buf_8192 9,388,064,768 131,072
> 11,170,152,448 zio_data_buf_114688 11,168,890,880 1,261,568
> 29,607,329,792 zio_data_buf_131072 29,606,674,432 655,360
> 32,944,750,592 zio_buf_65536 32,943,833,088 917,504
> 114,235,296,752 TOTAL 111,787,212,900 2,448,083,852
>
>
> [root at ZFS-AF ~]# vmstat -z | tail +3 | awk -F '[:,] *' 'BEGIN { total=0;
> cache=0; used=0 } {u = $2 * $4; c = $2 * $5; t = u + c; cache += c; used
> += u; total += t; name=$1; gsub(" ", "_", name); print t, name, u, c}
> END { print total, "TOTAL", used, cache } ' | sort -n +3 | perl -a -p -e
> 'while (($j, $_) = each(@F)) { 1 while s/^(-?\d+)(\d{3})/$1,$2/; print
> $_, " "} print "\n"' | column -t | tail
> Sorted by cache (waste):
> TOTAL NAME USED CACHE
> 71,565,312 cblk15 0 71,565,312
> 72,220,672 cblk16 0 72,220,672
> 72,351,744 cblk18 131,072 72,220,672
> 72,744,960 cblk3 0 72,744,960
> 75,497,472 cblk8 0 75,497,472
> 76,283,904 cblk22 0 76,283,904
> 403,696,384 128 286,225,792 117,470,592
> 229,519,360 mbuf_jumbo_page 67,043,328 162,476,032
> 1,196,795,160 arc_buf_hdr_t_l2only 601,620,624 595,174,536
> 114,220,354,544 TOTAL 111,778,349,508 2,442,005,036
>
>
> Maybe the right thing to do is to call the new kmem_cache_reap_soon()
> or other functions that might actually reduce fragmentation, or to
> rate-limit how quickly the ARC will shrink?
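The rate-limiting idea could be sketched as a hold-off window around the fragmentation path, so a single fragmentation event cannot walk the ARC all the way to arc_min in back-to-back passes (purely illustrative; frag_holdoff and the function name are made up, not existing tunables):

```c
#include <stdbool.h>
#include <stdint.h>

static uint64_t frag_last_shrink;	/* tick of the last frag-driven shrink */
static uint64_t frag_holdoff = 100;	/* hold-off window in ticks (made up) */

/*
 * Only allow the FMR_ZIO_FRAG path to report a deficit once per hold-off
 * window; within the window, pretend there is no fragmentation pressure.
 */
static bool
frag_shrink_allowed(uint64_t now)
{
	if (now - frag_last_shrink < frag_holdoff)
		return (false);		/* still inside the hold-off window */
	frag_last_shrink = now;
	return (true);
}
```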
>
> What kind of tools do we have to look at why memory is so fragmented
> that ZFS feels the need to tank the ARC?
>
>
>
> I know this block and the FMR_ZIO_FRAG reason have been removed from
> -CURRENT as part of the NUMA work, but I am worried about addressing
> this issue for the upcoming 11.2-RELEASE.
>
>
>
Does anyone have any thoughts on this? The 11.2 code slush starts in one
week, so we really need to decide what to do here.
--
Allan Jude