From: Chris Ross
Date: Fri, 12 Nov 2021 14:50:00 -0500
To: Warner Losh
Cc: Ronald Klop, freebsd-fs
Subject: Re: swap_pager: cannot allocate bio
Message-Id: <4008C512-31F1-4BE3-B674-A270CF674757@distal.com>
List-Archive: https://lists.freebsd.org/archives/freebsd-fs

> On Nov 12, 2021, at 11:15, Warner Losh wrote:
>
> So the root cause of this problem is well known. You have a memory shortage, so you want to page out dirty pages to reclaim memory.
> However, there's not enough memory to allocate the structures you need to do I/O and so the swapout I/O fails half way down
> the stack not being able to allocate a bio. Some paths through the swapper cope with this well, other parts that execute less
> often cope less well.
>
> There's some hacks in the tree today to help with the GELI case: we prioritize swapping I/O. But there's no g_alloc_bio_swapping() interface
> for swapping I/O to get priority on allocating a bio to start with. Places that use g_clone_bio() could have the clone's copy allocated
> from a special swap pool, but that starts to get messy and isn't done today. And the upper layers like geom_cfs and ZFS are
> inconsistent in allocations, so there's work needed to make it robust in ZFS, but I have only a vague notion of what's needed. At the very
> least, the swapping I/O that comes into the top of ZFS won't have swapping I/O marked coming out the bottom because the
> BIO_SWAP flag is quite new.
>
> So until then, swapping on zvols is fraught with deadlocks like this and in the past there's been a strong admonishment
> against it.

Apologies, Warner, but I’m not sure I’m understanding this last statement. If you mean swapping _onto_ zvols, I’m not doing that. If you mean swapping in any way, while having zvols, then yes I am doing that.

My swap is on a partition on the non-ZFS disk. A physical disk as far as the kernel knows, hardware RAID1.

# pstat -s
Device          1K-blocks     Used     Avail Capacity
/dev/da0p3      445682648  1018524 444664124     0%

Let me know if what you’re saying above is true to my case, and any advice as to how I can avoid it. I had a “not enough swap space” a while back, and accordingly increased the size of my swap partition. I have 128GB of memory, though between the ARC and the big process I was running, that fills it easily.

- Chris