Date: Thu, 30 Oct 2025 10:53:13 -0400
From: Mark Johnston
To: Andriy Gapon
Cc: FreeBSD Current
Subject: Re: limiting jail memory use with rctl/racct
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On Wed, Sep 24, 2025 at 02:08:11PM +0300, Andriy Gapon wrote:
>
> I wonder if people here use rctl to limit memory utilization for some
> practical purposes and what your experience is.
>
> Recently I had a "bright" idea to limit memory use of Firefox (which, for
> me, tends to consume all memory and swap impacting everything else on the
> system).
> Since Firefox is multi-process now, I decided to use a "null" jail as a
> resource container.
> That is, a jail configured with path=/ mount.nodevfs host=inherit ip4=inherit.
> There is no filesystem or network isolation (so, no security benefits), just
> grouping of related Firefox processes.
>
> The memory limit is set with this rule:
> jail:firefox-cage:memoryuse:deny=8g
>
> I didn't know in advance how the memory limiting would affect Firefox and
> how Firefox would react to it, so I decided to go ahead and experiment.
>
> I want to add that initially I also had a rule to limit swapuse but with it
> enabled, Firefox wouldn't even start. When I removed the rule I observed
> that initially rctl reported some absurdly high and unstable swapuse for the
> jail. Gradually, it went down to some reasonable values. Maybe there is
> some bug in RACCT code about accounting swap.

I think the implementation is bogus.  It hooks into swap_reserve_by_cred()
so really it's limiting the amount of swap-backed virtual memory which can
be allocated to the process.  Each swap-backed VM object (typically
corresponding to anonymous memory) has an associated user ID which is
charged for the virtual mappings of that object.  This can be seen by
looking at RLIMIT_SWAP, e.g., on my desktop
`procstat rlimitusage $(pgrep firefox) | grep swap` shows the same value for
all processes.  racct is hooking in at the wrong place.  It also assumes
that calls to swap_reserve_by_cred() and swap_release_by_cred() are balanced
within a single process, which I think is not true.
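
A toy model of that last point, in case it helps; this is not the kernel
code.  It assumes, purely for illustration, that a reservation is charged to
a user ID while a second, racct-style counter is bumped for whichever process
happens to make the call.  The class name, the pids and the uid are all
hypothetical.

```python
from collections import defaultdict


class ToySwapAccounting:
    """Toy model only: per-uid charging plus a per-process side counter."""

    def __init__(self):
        self.per_uid = defaultdict(int)      # what the reservation is charged to
        self.per_process = defaultdict(int)  # hypothetical per-process view

    def reserve(self, uid, pid, nbytes):
        self.per_uid[uid] += nbytes
        self.per_process[pid] += nbytes

    def release(self, uid, pid, nbytes):
        self.per_uid[uid] -= nbytes
        self.per_process[pid] -= nbytes


def demo():
    gib = 1 << 30
    acct = ToySwapAccounting()
    # Process 100 reserves 4 GiB for an anonymous object ...
    acct.reserve(uid=1001, pid=100, nbytes=4 * gib)
    # ... but a different process (200) ends up making the release call.
    acct.release(uid=1001, pid=200, nbytes=4 * gib)
    return acct.per_uid[1001], acct.per_process[100], acct.per_process[200]
```

In this model the per-uid total returns to zero, but process 100 still
appears to hold 4 GiB and process 200 appears to hold a negative amount, so
any limit checked against the per-process view is checking a meaningless
number.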
> For example:
> $ rctl -h -u jail:firefox-cage: | sort
> coredumpsize=0
> cputime=524
> datasize=276K
> maxproc=23
> memorylocked=0
> memoryuse=8236M
> msgqqueued=0
> msgqsize=0
> nmsgq=0
> nsem=0
> nsemop=0
> nshm=0
> nthr=559
> openfiles=5376
> pcpu=93
> pseudoterminals=0
> readbps=0
> readiops=0
> shmsize=0
> stacksize=8792K
> swapuse=32G
> vmemoryuse=73G
> wallclock=3445
> writebps=288
> writeiops=2
>
> One minute later:
> $ rctl -h -u jail:firefox-cage: | sort
> coredumpsize=0
> cputime=588
> datasize=312K
> maxproc=26
> memorylocked=0
> memoryuse=8249M
> msgqqueued=0
> msgqsize=0
> nmsgq=0
> nsem=0
> nsemop=0
> nshm=0
> nthr=633
> openfiles=5496
> pcpu=80
> pseudoterminals=0
> readbps=0
> readiops=0
> shmsize=0
> stacksize=10M
> swapuse=19G
> vmemoryuse=73G
> wallclock=5140
> writebps=32K
> writeiops=16
>
> So, I had to ditch that rule although I find limiting memoryuse without
> limiting swapuse to be incomplete.
>
> Also, I didn't even consider limiting vmemoryuse because it is very large,
> it is hard to predict and it seems to have little correlation with the
> physical memory use.

I agree, limiting vmemoryuse is not very useful in general.  It's just easy
to implement.

> Regarding the experiment, Firefox more or less works, but not without issues.
> When a lot of sites are open in tabs, especially some "web
> applications" that I have to use and which I know to be memory hogs, Firefox
> starts glitching here and there. Mostly it looks like some broken
> JavaScript.
>
> Another observation is that memoryuse always stays somewhat above the 8 GB
> limit. Sometimes it's just very slightly above, sometimes it's a couple of
> hundred megs (or a few percent) above, e.g., memoryuse=8455M.
>
> And almost all the time I see a vmdaemon thread being active:
>   PID USERNAME    PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
>    16 root        119    -     0B    16K CPU4     4  57.1H  68.98% vmdaemon
>
> And it's always somewhere in this call chain:
> procstat -kk 16
>   PID    TID COMM                TDNAME              KSTACK
>    16 100177 vmdaemon            -                   vm_swapout_object_deactivate+0x130
> vm_swapout_map_deactivate_pages+0x1f3 vm_daemon+0x87d fork_exit+0xc7
> fork_trampoline+0xe
>
> My impression is that vm_daemon is trying to inactivate some pages belonging
> to processes in the jail, so that they could get swapped out. But either
> they get reactivated or the pageout code does not see a need to swap them
> out and they remain resident.

I think the RACCT_RSS implementation doesn't work well in general.  The
vm_daemon loop periodically (1 Hz) scans all processes in the system, and for
each process updates the stored RSS and checks to see if a limit applying to
the process is reached.  If so, it picks some pages mapped into the process
and tries to unmap them, so they don't count against the RSS anymore.  But:
its strategy for picking pages to unmap is totally unrelated to their usage,
i.e., it may unmap frequently accessed pages, in which case they will be
faulted back into the pmap very quickly.  In that case, vmdaemon and firefox
will constantly be fighting each other.

> I'd say that this is kind of an unexpected consequence.
> It keeps a CPU core busy and doesn't allow the system to enter power saving states.
>
> To conclude, this has been a useful experiment for me.
> Initially, I had some naive expectations that memory limiting would just
> magically "limit memory". The experiment forced me to think about what it
> actually means to limit memory, how it could be done, what consequences it
> would have and in what cases it could be useful.
>
> If anyone has better suggestions and better experience, please let me know.

I don't have any better suggestions.
It would take a fair bit of work to improve racct such that it's able to limit memory usage the way you want.
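
For what it's worth, the fight between vmdaemon and firefox described above
can be sketched with a toy simulation.  This is not how the kernel picks
pages; it only assumes a periodic scan that unmaps enough pages to get back
under the limit, with a random choice standing in for "unrelated to usage".
The function name, page counts and the "cold-first" policy are hypothetical.

```python
import random


def simulate(policy, ticks=10, resident=1000, hot=700, limit=800, seed=0):
    """Return how many hot pages get faulted back in over `ticks` scans.

    Pages 0..hot-1 are touched between every scan; the remaining pages are
    never touched again.  All numbers are made up for illustration.
    """
    rng = random.Random(seed)
    hot_pages = set(range(hot))
    mapped = set(range(resident))
    refaults = 0
    for _ in range(ticks):
        excess = len(mapped) - limit
        if excess > 0:
            if policy == "blind":
                # Victims chosen without looking at how the pages are used.
                victims = rng.sample(sorted(mapped), excess)
            else:
                # "cold-first": unmap pages outside the hot set first.
                victims = sorted(mapped, key=lambda p: p in hot_pages)[:excess]
            mapped.difference_update(victims)
        # The process runs and touches its hot pages again.
        missing = hot_pages - mapped
        refaults += len(missing)
        mapped |= missing
    return refaults
```

With these numbers the "cold-first" policy gets under the limit in one scan
and causes no refaults, while the "blind" policy keeps unmapping hot pages
that come straight back, so the scan has to run again on the next tick.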