Re: swap_pager: cannot allocate bio

From: Chris Ross <cross+freebsd_at_distal.com>
Date: Fri, 12 Nov 2021 19:50:00 UTC

> On Nov 12, 2021, at 11:15, Warner Losh <imp@bsdimp.com> wrote:
> So the root cause of this problem is well known. You have a memory shortage, so you want to page out dirty pages to reclaim memory.
> However, there's not enough memory to allocate the structures you need to do I/O and so the swapout I/O fails half way down
> the stack not being able to allocate a bio. Some paths through the swapper cope with this well, other parts that execute less
> often cope less well.
> 
> There's some hacks in the tree today to help with the GELI case: we prioritize swapping I/O. But there's no g_alloc_bio_swapping() interface
> for swapping I/O to get priority on allocating a bio to start with. Places that use g_clone_bio() could have the clone's copy allocated
> from a special swap pool, but that starts to get messy and isn't done today. And the upper layers like geom_cfs and ZFS are
> inconsistent in allocations, so there's work needed to make it robust in ZFS, but I have only a vague notion of what's needed. At the very
> least, the swapping I/O that comes into the top of ZFS won't have swapping I/O marked coming out the bottom because the
> BIO_SWAP flag is quite new.
> 
> So until then, swapping on zvols is fraught with deadlocks like this and in the past there's been a strong admonishment
> against it.

Apologies, Warner, but I’m not sure I understand this last statement. If you mean swapping _onto_ zvols, I’m not doing that. If you mean swapping at all on a system that has zvols, then yes, I am doing that.

My swap is on a partition on the non-ZFS disk, which is a hardware RAID1 volume that looks like a single physical disk to the kernel.

# pstat -s
Device          1K-blocks     Used     Avail Capacity
/dev/da0p3      445682648  1018524 444664124     0%

Let me know whether what you describe above applies to my case, and whether you have any advice on how I can avoid it. I hit a “not enough swap space” error a while back, and increased the size of my swap partition accordingly. I have 128GB of memory, but between the ARC and the big process I was running, that fills up easily.

          - Chris