Date: Sat, 15 Jan 2022 10:55:18 -0500
From: Mark Johnston
To: Mark Millard
Cc: freebsd-current, dev-commits-src-main@freebsd.org
Subject: Re: git: 4a864f624a70 - main - vm_pageout: Print a more accurate message to the console before an OOM kill
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On Fri, Jan 14, 2022 at 09:38:56PM -0800, Mark Millard wrote:
> Thanks. This will allow me to remove part of my personal additions
> in this area --and my having to explain the misnomer when trying
> to help someone analyze why they end up with OOM activity so they
> can figure out what to do about it.
>
> There seem to be two separate sources of VM_OOM_SWAPZ. Showing
> my personal additions for them (just making them explicit in the
> sequence of messages generated):
>
> diff --git a/sys/vm/swap_pager.c b/sys/vm/swap_pager.c
> index 01cf9233329f..280621ca51be 100644
> --- a/sys/vm/swap_pager.c
> +++ b/sys/vm/swap_pager.c
> @@ -2091,6 +2091,7 @@ swp_pager_meta_build(vm_object_t object, vm_pindex_t pindex, daddr_t swapblk)
>  				    0, 1))
>  					printf("swap blk zone exhausted, "
>  					    "increase kern.maxswzone\n");
> +				printf("swp_pager_meta_build: swap blk uma zone exhausted\n");
>  				vm_pageout_oom(VM_OOM_SWAPZ);
>  				pause("swzonxb", 10);
>  			} else
> @@ -2121,6 +2122,7 @@ swp_pager_meta_build(vm_object_t object, vm_pindex_t pindex, daddr_t swapblk)
>  				    0, 1))
>  					printf("swap pctrie zone exhausted, "
>  					    "increase kern.maxswzone\n");
> +				printf("swp_pager_meta_build: swap pctrie uma zone exhausted\n");
>  				vm_pageout_oom(VM_OOM_SWAPZ);
>  				pause("swzonxp", 10);
>  			} else
>
> Care to comment on the distinctions and why there are two
> contexts classified as "out of swap space"? Would either
> one show the swap space as (nearly?) all used in, say, top?
> Or might one of them still end up looking like a misnomer
> from just a top (or whatever) display?

Hmm, those cases should likely be changed from "out of swap space" to
"failed to allocate swap metadata" or something like that. Running out
of swap space is not itself a reason to trigger an OOM kill; if the page
daemon can continue to reclaim clean pages while swap is full, then
it'll do so without killing anything. If the swap devices are full and
the only way to reclaim memory is by laundering dirty pages, then
"failed to reclaim memory" is the message you'd likely see after this
commit.

The two cases which call vm_pageout_oom(VM_OOM_SWAPZ) arise when the
swap pager fails to allocate structures used to map physical pages to
their location on a swap device.
swap_pager_swap_init() pre-allocates these structures during boot, and
the size of the reserves is based on the amount of physical memory. In
particular, each VM object maintains a trie of "swap blocks", each of
which maps a run of SWAP_META_PAGES pages contiguous within an object to
individual blocks on a swap device. One zone provides internal nodes
for the trie, while the other provides these swap blocks. Assuming
perfect efficiency, the reserves provide enough memory to allow all of
physical memory to be swapped out, I believe. In practice there can be
external fragmentation of the page index space which leads to less than
perfect utilization of these metadata structures, in which case it's
possible to exhaust the reserves. This seems to be a fairly rare
scenario though.