From: Chris Ross
Date: Fri, 19 Nov 2021 22:35:52 -0500
Subject: Re: swap_pager: cannot allocate bio
To: Andriy Gapon
Cc: Mark Johnston, freebsd-fs

(Sorry that the subject on this thread may not be relevant any more, but I don’t want to disconnect the thread.)

> On Nov 15, 2021, at 13:17, Chris Ross wrote:
>> On Nov 15, 2021, at 10:08, Andriy Gapon wrote:
>
>> Yes, I propose to remove the wait for ARC evictions from arc_lowmem().
>>
>> Another thing that may help a bit is having a greater "slack" between a threshold where the page daemon starts paging out and a threshold where memory allocations start to wait (via vm_wait_domain).
>>
>> Also, I think that for a long time we had a problem (but not sure if it's still present) where allocations succeeded without waiting until the free memory went below certain threshold M, but once a thread started waiting in vm_wait it would not be woken up until the free memory went above another threshold N. And the problem was that N >> M. In other words, a lot of memory had to be freed (and not grabbed by other threads) before the waiting thread would be woken up.
>
> Thank you both for your inputs. Let me know if you’d like me to try anything, and I’ll kick (reboot) the system and can build a new kernel when you’d like. I did get another procstat -kka out of it this morning, and the system has since gone less responsive, but I assume that new procstat won’t show anything last night’s didn’t.

I’m still having this issue. I rebooted the machine, fsck’d the disks, and got it running again. Again, it ran for ~50 hours before getting stuck. I got another procstat -kka off of it; let me know if you’d like a copy of it. It looks like the active processes are all in arc_wait_for_eviction. A pagedaemon is in an arc_wait_for_eviction under an arc_lowmem, but the python processes that were doing the real work don’t have arc_lowmem in their stacks, just the arc_wait_for_eviction.

Please let me know if there’s anything I can do to assist in finding a remedy for this. Thank you.

- Chris
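
For anyone wanting to repeat the stack comparison above on a saved procstat -kka dump, here is a rough Python sketch. It is not a tool used in this thread. It assumes each thread occupies one line and that stack frames appear as name+0xOFFSET tokens; the helper names and the file name procstat.txt are made up for illustration.

```python
import re
from collections import Counter

# Assumption: a kernel stack frame is printed as name+0xOFFSET.
FRAME_RE = re.compile(r"([A-Za-z_][\w.]*)\+0x[0-9a-fA-F]+")


def frames_in(line):
    """Return the set of function names found in one line of output."""
    return set(FRAME_RE.findall(line))


def count_threads_by_function(lines, functions):
    """For each name in `functions`, count the lines whose stack contains it."""
    counts = Counter()
    for line in lines:
        frames = frames_in(line)
        for name in functions:
            if name in frames:
                counts[name] += 1
    return counts


def threads_with_only(lines, present, absent):
    """Return lines whose stack contains `present` but not `absent`."""
    hits = []
    for line in lines:
        frames = frames_in(line)
        if present in frames and absent not in frames:
            hits.append(line.rstrip("\n"))
    return hits


def summarize(lines):
    """Print how many threads wait for eviction with and without arc_lowmem."""
    lines = list(lines)
    counts = count_threads_by_function(
        lines, ["arc_wait_for_eviction", "arc_lowmem"]
    )
    direct = threads_with_only(lines, "arc_wait_for_eviction", "arc_lowmem")
    print("threads in arc_wait_for_eviction:", counts["arc_wait_for_eviction"])
    print("threads with arc_lowmem on the stack:", counts["arc_lowmem"])
    print("waiting for eviction without arc_lowmem:", len(direct))
```

Calling summarize(open("procstat.txt")) on a saved dump would print the three counts. Matching on +0x tokens rather than on column positions is deliberate, since command and thread names may contain spaces.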