From: Alan Somers
Date: Tue, 18 May 2021 15:07:44 -0600
Subject: The pagedaemon evicts ARC before scanning the inactive page list
To: FreeBSD Hackers, Mark Johnston

I'm using ZFS on servers with tons of RAM, running FreeBSD 12.2-RELEASE.  Sometimes they get into a pathological situation where most of that RAM sits unused.  For example, right now one of them has:

2 GB    Active
529 GB  Inactive
16 GB   Free
99 GB   ARC total
469 GB  ARC max
86 GB   ARC target

When a server gets into this situation, it stays there for days, with the ARC target barely budging.  All that inactive memory never gets reclaimed and put to good use.  Frequently the server never recovers until a reboot.

I have a theory for what's going on.  Ever since r334508^ the pagedaemon sends the vm_lowmem event _before_ it scans the inactive page list.  If the ARC frees enough memory, then vm_pageout_scan_inactive won't need to free any.  Is that order really correct?  For reference, here's the relevant code, from vm_pageout_worker:

	shortage = pidctrl_daemon(&vmd->vmd_pid, vmd->vmd_free_count);
	if (shortage > 0) {
		ofree = vmd->vmd_free_count;
		if (vm_pageout_lowmem() && vmd->vmd_free_count > ofree)
			shortage -= min(vmd->vmd_free_count - ofree,
			    (u_int)shortage);
		target_met = vm_pageout_scan_inactive(vmd, shortage,
		    &addl_shortage);
	} else
		addl_shortage = 0;

Raising vfs.zfs.arc_min seems to work around the problem.  But ideally that wouldn't be necessary.

-Alan

^ https://svnweb.freebsd.org/base?view=revision&revision=334508
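
P.S. To make the question concrete, here is an untested sketch of the ordering I would have expected: scan the inactive queue first, and only send vm_lowmem if the scan by itself can't meet the target.  This is just a guess at what the logic might look like, not a reviewed patch; apart from the reordering, the names are exactly the ones from vm_pageout_worker quoted above:

	shortage = pidctrl_daemon(&vmd->vmd_pid, vmd->vmd_free_count);
	if (shortage > 0) {
		/* Reclaim from the inactive queue before shrinking the ARC. */
		target_met = vm_pageout_scan_inactive(vmd, shortage,
		    &addl_shortage);
		/*
		 * Only ask vm_lowmem subscribers (e.g. ZFS) to free memory
		 * if the inactive scan alone couldn't meet the target.
		 */
		if (!target_met)
			(void)vm_pageout_lowmem();
	} else
		addl_shortage = 0;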
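
P.P.S. For anyone who wants to try the same workaround: vfs.zfs.arc_min is a loader tunable, and if I remember correctly it can also be raised at runtime with sysctl(8).  The value below is only an example (64 GiB); size it for your machine:

	# sysctl vfs.zfs.arc_min=68719476736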