Re: A panic by vm_pageout_scan_active activity, some details in case they might help
- In reply to: Mark Millard : "Re: A panic by vm_pageout_scan_active activity, some details in case they might help"
Date: Sun, 27 Jul 2025 23:23:32 UTC
On Jul 27, 2025, at 15:25, Mark Millard <marklmi@yahoo.com> wrote:
> On Jul 27, 2025, at 15:00, Mark Johnston <markj@freebsd.org> wrote:
>>
>> On Sun, Jul 27, 2025 at 02:26:29PM -0700, Mark Millard wrote:
>>> I tried a poudriere(-devel) bulk -Ca on the amd64 system that
>>> I have access to and a package build used up much of the
>>> RAM+SWAP == 704 GiBytes before a panic happened. Past examples
>>> OOM'd without panics, although I did not know the context until
>>> examining this crash dump.
>>
>> What is the panic string?
>
> The picture I took shows:
>
> Fatal Trap 12: page fault while in kernel mode
>
> # more /var/crash/info.4
> Dump header from device: /dev/gpt/OptBswp364
> Architecture: amd64
> Architecture Version: 2
> Dump Length: 20258381824
> Blocksize: 512
> Compression: none
> Dumptime: 2025-07-26 18:56:16 -0700
> Hostname: 7950X3D-ZFS
> Magic: FreeBSD Kernel Dump
> Version String: FreeBSD 15.0-CURRENT main-n278320-3a33e39edd48 GENERIC-NODEBUG
> Panic String: page fault
> Dump Parity: 668710208
> Bounds: 4
> Dump Status: good
>
>> Could you please open a report on bugzilla
>> and include the full core.txt.4?
>
> Okay.
Done:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=288507
>>> # uname -apKU
>>> FreeBSD 7950X3D-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT main-n278320-3a33e39edd48 GENERIC-NODEBUG amd64 amd64 1500048 1500048
>>>
>>> That is an official PkgBase installation of the boot-kernel and
>>> boot-world, not a personal build.
>>>
>>> The dump materials had references, for doxygen and for dot, to:
>>>
>>> /usr/local/poudriere/data/.m/main-ZNV4-bulk_a-alt/06/dev
>>>
>>> that let me track this to the [06] builder running at the time
>>> of the crash:
>>>
>>> [2D:01:22:29] [06] [00:00:00] Building graphics/sdl2_gpu | sdl2_gpu-0.12.0
>>>
>>> It was running doxygen, which in turn was running multiple dot processes.
>>>
>>> From /var/crash/core.txt.4 :
>>>
>>> UID PID PPID C PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND
>>> . . .
>>> 0 79229 40923 4 59 0 23524 4148 wait D - 0:00.00 [sh]
>>> 0 79230 79229 5 59 0 14208 172 wait Ds - 0:00.01 [make]
>>> 0 79233 79230 4 59 0 14668 176 wait D - 0:00.00 [sh]
>>> 0 79234 79233 5 59 0 14668 176 wait D - 0:00.00 [sh]
>>> 0 79235 79234 12 0 0 16284 356 select D - 0:00.01 [ninja]
>>> 0 79236 79235 28 59 0 223048 1052 uwait D - 0:00.44 [doxygen]
>>> 0 79272 79236 25 59 0 157589964 41424308 pfault D - 3:25.33 [dot]
>>> 0 79279 79236 31 59 0 157601740 41513520 pfault D - 3:23.41 [dot]
>>> 0 79289 79236 14 59 0 157589964 41361600 pfault D - 3:22.72 [dot]
>>> 0 79301 79236 18 49 0 157667276 41208476 pfault D - 3:24.32 [dot]
>>> . . .
>>>
>>> . . .
>>> #14 <signal handler called>
>>> No locals.
>>> #15 vm_pageout_scan_active (vmd=0xffffffff81c22380 <vm_dom>,
>>> page_shortage=102849)
>>> at /home/pkgbuild/worktrees/main/sys/vm/vm_pageout.c:1264
>>> ss = {bq = {bq_pa = {0xfffffe0030a1e500, 0xfffffe00a8798110,
>>> 0xfffffe00e3083e30, 0xfffffe00a47a4228, 0xfffffe002b6d8ef8,
>>> 0xfffffe0065cf29a0, 0xfffffe007a1b83b8, 0xfffffe008cf7b3c0,
>>> 0xfffffe005cd565e0, 0xfffffe0048ced5d8, 0xfffffe00c761d488,
>>> 0xfffffe008a5efe90, 0xfffffe00cf341738, 0xfffffe00413f97b8,
>>> 0xfffffe005270cc68, 0xfffffe00a5d9d690, 0xfffffe00294329e0,
>>> 0xfffffe005ef52f00, 0xfffffe0020dff308, 0xfffffe00ce1e9a40,
>>> 0xfffffe007ec47618, 0xfffffe005d1ba7e8, 0xfffffe0032d73470,
>>> 0xfffffe0030835e88, 0xfffffe009969c438, 0xfffffe00f151b0c8,
>>> 0xfffffe0063916fe8, 0xfffffe00dac0b778, 0xfffffe0016267348,
>>> 0xfffffe00b74a5fe0, 0xfffffe003434ef80, 0xfffffe009e31e840,
>>> 0xfffffe00530f6408, 0xfffffe00e0649508, 0xfffffe0102e87ad8,
>>> 0xfffffe0092c52848, 0xfffffe00ba829618, 0xfffffe008bf0fd10,
>>> 0xfffffe00550708c0, 0xfffffe00eedc67b8, 0xfffffe00d45f8210,
>>> 0xfffffe00b89a8698, 0xfffffe0082ffb310, 0xfffffe00accd53c0,
>>> 0xfffffe0091c8f5d8, 0xfffffe004e20f180, 0xfffffe004dfb4f90,
>>> 0xfffffe00a437fbb0, 0xfffffe00218cb698, 0xfffffe004ee5d278,
>>> 0xfffffe00a9e845a0, 0xfffffe0025d4a7c8, 0xfffffe0037612ac8,
>>> 0xfffffe005c7d3da8, 0xfffffe00d307c1b8, 0xfffffe00ee416538,
>>> 0xfffffe0043747508, 0xfffffe00ef30b508, 0xfffffe00c04de600,
>>> 0xfffffe008c0e3040, 0xfffffe0071a97b40, 0xfffffe005b644ad8,
>>> 0xfffffe00dd5da3b0}, bq_cnt = 39},
>>> pq = 0xffffffff81c22400 <vm_dom+128>,
>>> marker = 0xffffffff81c22778 <vm_dom+1016>, maxscan = 37165731,
>>> scanned = 15440544}
>>> marker = 0xffffffff81c22778 <vm_dom+1016>
>>> pq = 0xffffffff81c22400 <vm_dom+128>
>>> old = <optimized out>
>>> scan_tick = <optimized out>
>>> min_scan = <optimized out>
>>> m = 0xfffffe00eedc67b8
>>> object = 0x2b6c70f000
>>> refs = <optimized out>
>>> new = <optimized out>
>>> ps_delta = <optimized out>
>>> act_delta = <optimized out>
>>> max_scan = <optimized out>
>>> nqueue = <optimized out>
>>> _v = <optimized out>
>>> _tid = <optimized out>
>>> _v = <optimized out>
>>> _tid = <optimized out>
>>> _v = <optimized out>
>>> _v = <optimized out>
>>> _tid = <optimized out>
>>> _v = <optimized out>
>>> . . .
>>>
>>> From the /usr/src/sys/ for the PkgBase installation in use, there is in
>>> vm_pageout_scan_active :
>>>
>>> /home/pkgbuild/worktrees/main/sys/vm/vm_pageout.c: unmodified, readonly: line 1264 of 2416 [52%]
>>>
>>> /*
>>> * Check to see "how much" the page has been used.
>>> *
>>> * Test PGA_REFERENCED after calling pmap_ts_referenced() so
>>> * that a reference from a concurrently destroyed mapping is
>>> * observed here and now.
>>> *
>>> * Perform an unsynchronized object ref count check. While
>>> * the page lock ensures that the page is not reallocated to
>>> * another object, in particular, one with unmanaged mappings
>>> * that cannot support pmap_ts_referenced(), two races are,
>>> * nonetheless, possible:
>>> * 1) The count was transitioning to zero, but we saw a non-
>>> * zero value. pmap_ts_referenced() will return zero
>>> * because the page is not mapped.
>>> * 2) The count was transitioning to one, but we saw zero.
>>> * This race delays the detection of a new reference. At
>>> * worst, we will deactivate and reactivate the page.
>>> */
>>> refs = object->ref_count != 0 ? pmap_ts_referenced(m) : 0;
>>>
>>> I am unlikely to be able to replicate the panic.
>>>
>>> I hope that this is of some use.
>>>
>>> Note:
>>>
>>> I linked /home/pkgbuild/worktrees/main/sys to
>>> /usr/src/sys so that such paths work in my
>>> context.
>>
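One detail in frame #15 above may be worth a second look: object = 0x2b6c70f000 does not look like a kernel virtual address, while m and the vm_dom pointers do. Below is a minimal sketch of that plausibility check. It assumes the simplification that amd64 kernel virtual addresses have bit 63 set; the helper name in_upper_half is hypothetical and not from the FreeBSD sources. It does not establish that this value is what caused the fault at line 1264.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the FreeBSD sources): report whether a
 * 64-bit value lies in the upper half of the address space, i.e. has
 * bit 63 set. This is only a coarse plausibility test for an amd64
 * kernel virtual address, not a full canonical-address check.
 */
static bool
in_upper_half(uint64_t addr)
{
	return ((addr >> 63) != 0);
}
```

Applied to the values in the backtrace, m (0xfffffe00eedc67b8) and vmd (0xffffffff81c22380) pass this test, and object (0x2b6c70f000) does not.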
===
Mark Millard
marklmi at yahoo.com