Re: poudriere bulk with ZFS and USE_TMPFS=no on main [14-ALPHA2 based]: extensive vlruwk for cpdup's on new builders after pkg builds in first builder
- Reply: Mark Millard : "Re: poudriere bulk with ZFS and USE_TMPFS=no on main [14-ALPHA2 based]: extensive vlruwk for cpdup's on new builders after pkg builds in first builder"
- In reply to: Mark Millard : "Re: poudriere bulk with ZFS and USE_TMPFS=no on main [14-ALPHA2 based]: extensive vlruwk for cpdup's on new builders after pkg builds in first builder"
- Go to: [ bottom of page ] [ top of archives ] [ this month ]
Date: Thu, 24 Aug 2023 05:54:33 UTC
On 8/24/23, Mark Millard <marklmi@yahoo.com> wrote:
> On Aug 23, 2023, at 15:10, Mateusz Guzik <mjguzik@gmail.com> wrote:
>
>> On 8/23/23, Mark Millard <marklmi@yahoo.com> wrote:
>>> [Forked off the ZFS deadlock 14 discussion, per feedback.]
>>>
>>> On Aug 23, 2023, at 11:40, Alexander Motin <mav@FreeBSD.org> wrote:
>>>
>>>> On 22.08.2023 14:24, Mark Millard wrote:
>>>>> Alexander Motin <mav_at_FreeBSD.org> wrote on
>>>>> Date: Tue, 22 Aug 2023 16:18:12 UTC :
>>>>>> I am waiting for final test results from George Wilson and then will
>>>>>> request quick merge of both to zfs-2.2-release branch. Unfortunately
>>>>>> there are still not many reviewers for the PR, since the code is not
>>>>>> trivial, but at least with the test reports Brian Behlendorf and Mark
>>>>>> Maybee seem to be OK to merge the two PRs into 2.2. If somebody else
>>>>>> have tested and/or reviewed the PR, you may comment on it.
>>>>> I had written to the list that when I tried to test the system
>>>>> doing poudriere builds (initially with your patches) using
>>>>> USE_TMPFS=no so that zfs had to deal with all the file I/O, I
>>>>> instead got only one builder that ended up active, the others
>>>>> never reaching "Builder started":
>>>>
>>>>> Top was showing lots of "vlruwk" for the cpdup's. For example:
>>>>> . . .
>>>>> 362 0 root 40 0 27076Ki 13776Ki CPU19 19 4:23
>>>>> 0.00% cpdup -i0 -o ref 32
>>>>> 349 0 root 53 0 27076Ki 13776Ki vlruwk 22 4:20
>>>>> 0.01% cpdup -i0 -o ref 31
>>>>> 328 0 root 68 0 27076Ki 13804Ki vlruwk 8 4:30
>>>>> 0.01% cpdup -i0 -o ref 30
>>>>> 304 0 root 37 0 27076Ki 13792Ki vlruwk 6 4:18
>>>>> 0.01% cpdup -i0 -o ref 29
>>>>> 282 0 root 42 0 33220Ki 13956Ki vlruwk 8 4:33
>>>>> 0.01% cpdup -i0 -o ref 28
>>>>> 242 0 root 56 0 27076Ki 13796Ki vlruwk 4 4:28
>>>>> 0.00% cpdup -i0 -o ref 27
>>>>> . . .
>>>>> But those processes did show CPU?? on occasion, as well as
>>>>> *vnode less often. None of the cpdup's was stuck in
>>>>> Removing your patches did not change the behavior.
>>>>
>>>> Mark, to me "vlruwk" looks like a limit on number of vnodes. I was not
>>>> deep in that area at least recently, so somebody with more experience
>>>> there could try to diagnose it. At very least it does not look related
>>>> to
>>>> the ZIL issue discussed in this thread, at least with the information
>>>> provided, so I am not surprised that the mentioned patches do not
>>>> affect
>>>> it.
>>>
>>> I did the above intending to test the deadlock in my context but
>>> ended up not getting that far when I tried to make zfs handle all
>>> the file I/O (USE_TMPFS=no and no other use of tmpfs or the like).
>>>
>>> The zfs context is a simple single partition on the boot media. I
>>> use ZFS for bectl BE use, not for other typical reasons. The media
>>> here is PCIe Optane 1.4T media. The machine is a ThreadRipper
>>> 1950X, so first generation. 128 GiBytes of RAM. 491520 MiBytes of
>>> swap, also on that Optane.
>>>
>>> # uname -apKU
>>> FreeBSD amd64-ZFS 14.0-ALPHA2 FreeBSD 14.0-ALPHA2 amd64 1400096 #112
>>> main-n264912-b1d3e2b77155-dirty: Sun Aug 20 10:01:48 PDT 2023
>>> root@amd64-ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG
>>> amd64 amd64 1400096 1400096
>>>
>>> The GENERIC-DBG variant of the kernel did not report any issues in
>>> earlier testing.
>>>
>>> The alter referenced /usr/obj/DESTDIRs/main-amd64-poud-bulk_a was
>>> installed from the same build.
>>>
>>> # zfs list
>>> NAME USED AVAIL REFER
>>> MOUNTPOINT
>>> zoptb 79.9G 765G 96K /zoptb
>>> zoptb/BUILDs 20.5G 765G 8.29M
>>> /usr/obj/BUILDs
>>> zoptb/BUILDs/alt-main-amd64-dbg-clang-alt 1.86M 765G 1.86M
>>> /usr/obj/BUILDs/alt-main-amd64-dbg-clang-alt
>>> zoptb/BUILDs/alt-main-amd64-nodbg-clang-alt 30.2M 765G 30.2M
>>> /usr/obj/BUILDs/alt-main-amd64-nodbg-clang-alt
>>> zoptb/BUILDs/main-amd64-dbg-clang 9.96G 765G 9.96G
>>> /usr/obj/BUILDs/main-amd64-dbg-clang
>>> zoptb/BUILDs/main-amd64-dbg-gccxtc 38.5M 765G 38.5M
>>> /usr/obj/BUILDs/main-amd64-dbg-gccxtc
>>> zoptb/BUILDs/main-amd64-nodbg-clang 10.3G 765G 10.3G
>>> /usr/obj/BUILDs/main-amd64-nodbg-clang
>>> zoptb/BUILDs/main-amd64-nodbg-clang-alt 37.2M 765G 37.2M
>>> /usr/obj/BUILDs/main-amd64-nodbg-clang-alt
>>> zoptb/BUILDs/main-amd64-nodbg-gccxtc 94.6M 765G 94.6M
>>> /usr/obj/BUILDs/main-amd64-nodbg-gccxtc
>>> zoptb/DESTDIRs 4.33G 765G 104K
>>> /usr/obj/DESTDIRs
>>> zoptb/DESTDIRs/main-amd64-poud 2.16G 765G 2.16G
>>> /usr/obj/DESTDIRs/main-amd64-poud
>>> zoptb/DESTDIRs/main-amd64-poud-bulk_a 2.16G 765G 2.16G
>>> /usr/obj/DESTDIRs/main-amd64-poud-bulk_a
>>> zoptb/ROOT 13.1G 765G 96K none
>>> zoptb/ROOT/build_area_for-main-amd64 5.03G 765G 3.24G none
>>> zoptb/ROOT/main-amd64 8.04G 765G 3.23G none
>>> zoptb/poudriere 6.58G 765G 112K
>>> /usr/local/poudriere
>>> zoptb/poudriere/data 6.58G 765G 128K
>>> /usr/local/poudriere/data
>>> zoptb/poudriere/data/.m 112K 765G 112K
>>> /usr/local/poudriere/data/.m
>>> zoptb/poudriere/data/cache 17.4M 765G 17.4M
>>> /usr/local/poudriere/data/cache
>>> zoptb/poudriere/data/images 96K 765G 96K
>>> /usr/local/poudriere/data/images
>>> zoptb/poudriere/data/logs 2.72G 765G 2.72G
>>> /usr/local/poudriere/data/logs
>>> zoptb/poudriere/data/packages 3.84G 765G 3.84G
>>> /usr/local/poudriere/data/packages
>>> zoptb/poudriere/data/wrkdirs 112K 765G 112K
>>> /usr/local/poudriere/data/wrkdirs
>>> zoptb/poudriere/jails 96K 765G 96K
>>> /usr/local/poudriere/jails
>>> zoptb/poudriere/ports 96K 765G 96K
>>> /usr/local/poudriere/ports
>>> zoptb/tmp 68.5M 765G 68.5M /tmp
>>> zoptb/usr 35.1G 765G 96K /usr
>>> zoptb/usr/13_0R-src 2.64G 765G 2.64G
>>> /usr/13_0R-src
>>> zoptb/usr/alt-main-src 96K 765G 96K
>>> /usr/alt-main-src
>>> zoptb/usr/home 181M 765G 181M
>>> /usr/home
>>> zoptb/usr/local 5.08G 765G 5.08G
>>> /usr/local
>>> zoptb/usr/main-src 833M 765G 833M
>>> /usr/main-src
>>> zoptb/usr/ports 26.4G 765G 26.4G
>>> /usr/ports
>>> zoptb/usr/src 96K 765G 96K
>>> /usr/src
>>> zoptb/var 52.6M 765G 96K /var
>>> zoptb/var/audit 356K 765G 356K
>>> /var/audit
>>> zoptb/var/crash 128K 765G 128K
>>> /var/crash
>>> zoptb/var/db 49.7M 765G 96K
>>> /var/db
>>> zoptb/var/db/pkg 49.4M 765G 49.4M
>>> /var/db/pkg
>>> zoptb/var/db/ports 164K 765G 164K
>>> /var/db/ports
>>> zoptb/var/log 1.61M 765G 1.61M
>>> /var/log
>>> zoptb/var/mail 632K 765G 632K
>>> /var/mail
>>> zoptb/var/tmp 128K 765G 128K
>>> /var/tmp
>>>
>>> # poudriere jail -jmain-amd64-bulk_a -i
>>> Jail name: main-amd64-bulk_a
>>> Jail version: 14.0-ALPHA2
>>> Jail arch: amd64
>>> Jail method: null
>>> Jail mount: /usr/obj/DESTDIRs/main-amd64-poud-bulk_a
>>> Jail fs:
>>> Jail updated: 2021-12-04 14:55:22
>>> Jail pkgbase: disabled
>>>
>>>
>>>
>>> So, setting up another test with some related information
>>> shown before, during, and after. sysctl output is from
>>> another ssh session than the bulk -a run.
>>>
>>> # sysctl -a | grep vnode
>>> kern.maxvnodes: 2213808
>>> kern.ipc.umtx_vnode_persistent: 0
>>> kern.minvnodes: 553452
>>> vm.vnode_pbufs: 2048
>>> vm.stats.vm.v_vnodepgsout: 0
>>> vm.stats.vm.v_vnodepgsin: 272429
>>> vm.stats.vm.v_vnodeout: 0
>>> vm.stats.vm.v_vnodein: 12461
>>> vfs.vnode_alloc_sleeps: 0
>>> vfs.wantfreevnodes: 553452
>>> vfs.freevnodes: 962766
>>> vfs.vnodes_created: 2538980
>>> vfs.numvnodes: 1056233
>>> vfs.cache.debug.vnodes_cel_3_failures: 0
>>> vfs.cache.stats.heldvnodes: 91878
>>> debug.vnode_domainset: <NULL>
>>> debug.sizeof.vnode: 448
>>> debug.fail_point.status_fill_kinfo_vnode__random_path: off
>>> debug.fail_point.fill_kinfo_vnode__random_path: off
>>>
>>> # poudriere bulk -jmain-amd64-bulk_a -a
>>> . . .
>>> [00:01:34] Building 34042 packages using up to 32 builders
>>> [00:01:34] Hit CTRL+t at any time to see build progress and stats
>>> [00:01:34] [01] [00:00:00] Builder starting
>>> [00:01:57] [01] [00:00:23] Builder started
>>> [00:01:57] [01] [00:00:00] Building ports-mgmt/pkg | pkg-1.20.4
>>> [00:03:09] [01] [00:01:12] Finished ports-mgmt/pkg | pkg-1.20.4: Success
>>> [00:03:22] [01] [00:00:00] Building print/indexinfo | indexinfo-0.3.1
>>> [00:03:22] [02] [00:00:00] Builder starting
>>> [00:03:22] [03] [00:00:00] Builder starting
>>> . . .
>>> [00:03:22] [31] [00:00:00] Builder starting
>>> [00:03:22] [32] [00:00:00] Builder starting
>>> [00:03:31] [01] [00:00:09] Finished print/indexinfo | indexinfo-0.3.1:
>>> Success
>>> [00:03:31] [01] [00:00:00] Building devel/gettext-runtime |
>>> gettext-runtime-0.22
>>> . . .
>>>
>>> Note that only [01] makes progress: no new "Builder started"
>>> notices occur. top shows 31 of the pattern:
>>> cpdup -i0 -o ref ??
>>>
>>> Then during the 31 cpudup's showing vlruwk most of the time:
>>>
>>> # sysctl -a | grep vnode
>>> kern.maxvnodes: 2213808
>>> kern.ipc.umtx_vnode_persistent: 0
>>> kern.minvnodes: 553452
>>> vm.vnode_pbufs: 2048
>>> vm.stats.vm.v_vnodepgsout: 22844
>>> vm.stats.vm.v_vnodepgsin: 582398
>>> vm.stats.vm.v_vnodeout: 890
>>> vm.stats.vm.v_vnodein: 34296
>>> vfs.vnode_alloc_sleeps: 2994
>>> vfs.wantfreevnodes: 553452
>>> vfs.freevnodes: 2209662
>>> vfs.vnodes_created: 12206299
>>> vfs.numvnodes: 2214071
>>> vfs.cache.debug.vnodes_cel_3_failures: 0
>>> vfs.cache.stats.heldvnodes: 459
>>> debug.vnode_domainset: <NULL>
>>> debug.sizeof.vnode: 448
>>> debug.fail_point.status_fill_kinfo_vnode__random_path: off
>>> debug.fail_point.fill_kinfo_vnode__random_path: off
>>>
>>> Wait a while but still the mostly cpdup vlruwk status:
>>>
>>> # sysctl -a | grep vnode
>>> kern.maxvnodes: 2213808
>>> kern.ipc.umtx_vnode_persistent: 0
>>> kern.minvnodes: 553452
>>> vm.vnode_pbufs: 2048
>>> vm.stats.vm.v_vnodepgsout: 22844
>>> vm.stats.vm.v_vnodepgsin: 583527
>>> vm.stats.vm.v_vnodeout: 890
>>> vm.stats.vm.v_vnodein: 34396
>>> vfs.vnode_alloc_sleeps: 8053
>>> vfs.wantfreevnodes: 553452
>>> vfs.freevnodes: 2210166
>>> vfs.vnodes_created: 12212061
>>> vfs.numvnodes: 2215106
>>> vfs.cache.debug.vnodes_cel_3_failures: 0
>>> vfs.cache.stats.heldvnodes: 497
>>> debug.vnode_domainset: <NULL>
>>> debug.sizeof.vnode: 448
>>> debug.fail_point.status_fill_kinfo_vnode__random_path: off
>>> debug.fail_point.fill_kinfo_vnode__random_path: off
>>>
>>> ^C[00:14:55] Error: Signal SIGINT caught, cleaning up and exiting
>>>
>>> # sysctl -a | grep vnode
>>> kern.maxvnodes: 2213808
>>> kern.ipc.umtx_vnode_persistent: 0
>>> kern.minvnodes: 553452
>>> vm.vnode_pbufs: 2048
>>> vm.stats.vm.v_vnodepgsout: 22844
>>> vm.stats.vm.v_vnodepgsin: 584474
>>> vm.stats.vm.v_vnodeout: 890
>>> vm.stats.vm.v_vnodein: 34591
>>> vfs.vnode_alloc_sleeps: 17584
>>> vfs.wantfreevnodes: 553452
>>> vfs.freevnodes: 2210796
>>> vfs.vnodes_created: 12222343
>>> vfs.numvnodes: 2216564
>>> vfs.cache.debug.vnodes_cel_3_failures: 0
>>> vfs.cache.stats.heldvnodes: 539
>>> debug.vnode_domainset: <NULL>
>>> debug.sizeof.vnode: 448
>>> debug.fail_point.status_fill_kinfo_vnode__random_path: off
>>> debug.fail_point.fill_kinfo_vnode__random_path: off
>>>
>>> [main-amd64-bulk_a-default] [2023-08-23_13h58m08s] [sigint:] Queued:
>>> 34435
>>> Built: 2 Failed: 0 Skipped: 35 Ignored: 358 Fetched: 0
>>> Tobuild: 34040 Time: 00:14:36
>>> [00:16:13] Logs:
>>> /usr/local/poudriere/data/logs/bulk/main-amd64-bulk_a-default/2023-08-23_13h58m08s
>>> [00:16:49] Cleaning up
>>> load: 5.28 cmd: sh 77057 [vlruwk] 141.63r 0.00u 30.98s 28% 6932k
>>> #0 0xffffffff80b76ebb at mi_switch+0xbb
>>> #1 0xffffffff80bc960f at sleepq_timedwait+0x2f
>>> #2 0xffffffff80b76610 at _sleep+0x1d0
>>> #3 0xffffffff80c5b2dc at vn_alloc_hard+0x2ac
>>> #4 0xffffffff80c50a12 at getnewvnode_reserve+0x92
>>> #5 0xffffffff829afb12 at zfs_zget+0x22
>>> #6 0xffffffff8299ca8d at zfs_dirent_lookup+0x16d
>>> #7 0xffffffff8299cb5f at zfs_dirlook+0x7f
>>> #8 0xffffffff829ac410 at zfs_lookup+0x350
>>> #9 0xffffffff829a782a at zfs_freebsd_cachedlookup+0x6a
>>> #10 0xffffffff80c368ad at vfs_cache_lookup+0xad
>>> #11 0xffffffff80c3b6d8 at cache_fplookup_final_modifying+0x188
>>> #12 0xffffffff80c38766 at cache_fplookup+0x356
>>> #13 0xffffffff80c43fb2 at namei+0x112
>>> #14 0xffffffff80c62e5b at kern_funlinkat+0x13b
>>> #15 0xffffffff80c62d18 at sys_unlink+0x28
>>> #16 0xffffffff83b8e583 at filemon_wrapper_unlink+0x13
>>> #17 0xffffffff81049a79 at amd64_syscall+0x109
>>>
>>> [00:26:28] Unmounting file systems
>>> Exiting with status 1
>>>
>>> # sysctl -a | grep vnode
>>> kern.maxvnodes: 2213808
>>> kern.ipc.umtx_vnode_persistent: 0
>>> kern.minvnodes: 553452
>>> vm.vnode_pbufs: 2048
>>> vm.stats.vm.v_vnodepgsout: 22844
>>> vm.stats.vm.v_vnodepgsin: 585384
>>> vm.stats.vm.v_vnodeout: 890
>>> vm.stats.vm.v_vnodein: 34798
>>> vfs.vnode_alloc_sleeps: 27578
>>> vfs.wantfreevnodes: 553452
>>> vfs.freevnodes: 61362
>>> vfs.vnodes_created: 20135479
>>> vfs.numvnodes: 59860
>>> vfs.cache.debug.vnodes_cel_3_failures: 0
>>> vfs.cache.stats.heldvnodes: 208
>>> debug.vnode_domainset: <NULL>
>>> debug.sizeof.vnode: 448
>>> debug.fail_point.status_fill_kinfo_vnode__random_path: off
>>> debug.fail_point.fill_kinfo_vnode__random_path: off
>>>
>>>
>>> For reference (from after):
>>>
>>> # kldstat
>>> Id Refs Address Size Name
>>> 1 95 0xffffffff80200000 274b308 kernel
>>> 2 1 0xffffffff8294c000 5d5238 zfs.ko
>>> 3 1 0xffffffff82f22000 7718 cryptodev.ko
>>> 4 1 0xffffffff83b10000 3390 acpi_wmi.ko
>>> 5 1 0xffffffff83b14000 3220 intpm.ko
>>> 6 1 0xffffffff83b18000 2178 smbus.ko
>>> 7 1 0xffffffff83b1b000 2240 cpuctl.ko
>>> 8 1 0xffffffff83b1e000 3360 uhid.ko
>>> 9 1 0xffffffff83b22000 4364 ums.ko
>>> 10 1 0xffffffff83b27000 33c0 usbhid.ko
>>> 11 1 0xffffffff83b2b000 3380 hidbus.ko
>>> 12 1 0xffffffff83b2f000 4d20 ng_ubt.ko
>>> 13 6 0xffffffff83b34000 abb8 netgraph.ko
>>> 14 2 0xffffffff83b3f000 a250 ng_hci.ko
>>> 15 4 0xffffffff83b4a000 2670 ng_bluetooth.ko
>>> 16 1 0xffffffff83b4d000 83a0 uftdi.ko
>>> 17 1 0xffffffff83b56000 4e58 ucom.ko
>>> 18 1 0xffffffff83b5b000 3360 wmt.ko
>>> 19 1 0xffffffff83b5f000 e268 ng_l2cap.ko
>>> 20 1 0xffffffff83b6e000 1bf68 ng_btsocket.ko
>>> 21 1 0xffffffff83b8a000 38f8 ng_socket.ko
>>> 22 1 0xffffffff83b8e000 3250 filemon.ko
>>> 23 1 0xffffffff83b92000 4758 nullfs.ko
>>> 24 1 0xffffffff83b97000 73c0 linprocfs.ko
>>> 25 3 0xffffffff83b9f000 be70 linux_common.ko
>>> 26 1 0xffffffff83bab000 3558 fdescfs.ko
>>> 27 1 0xffffffff83baf000 31b20 linux.ko
>>> 28 1 0xffffffff83be1000 2ed40 linux64.ko
>>>
>>>
>>> Note that before the "Cleaning up" notice, the vfs.freevnodes
>>> shows as being around (for example) 2210796. But after
>>> "Exiting with status": 61362. vfs.vnodes_created has a
>>> similar staging of in the ball park of up to 12222343 but
>>> then the change to: 20135479. Similarly, vfs.numvnodes
>>> 2216564 -> 59860.
>>>
>>>
>>>
>>> Anything else I should gather and report as basic information?
>>>
>>
>> This is a known problem, but it is unclear if you should be running
>> into it in this setup.
>
> The changed fixed the issue: so I do run into the the issue
> for this setup. See below.
>
>> Can you try again but this time *revert*
>> 138a5dafba312ff39ce0eefdbe34de95519e600d, like so:
>> git revert 138a5dafba312ff39ce0eefdbe34de95519e600d
>>
>> may want to switch to a different branch first, for example: git
>> checkout -b vfstesting
>
> # git -C /usr/main-src/ diff sys/kern/vfs_subr.c
> diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
> index 0f3f00abfd4a..5dff556ac258 100644
> --- a/sys/kern/vfs_subr.c
> +++ b/sys/kern/vfs_subr.c
> @@ -3528,25 +3528,17 @@ vdbatch_process(struct vdbatch *vd)
> MPASS(curthread->td_pinned > 0);
> MPASS(vd->index == VDBATCH_SIZE);
> + mtx_lock(&vnode_list_mtx);
> critical_enter();
> - if (mtx_trylock(&vnode_list_mtx)) {
> - for (i = 0; i < VDBATCH_SIZE; i++) {
> - vp = vd->tab[i];
> - vd->tab[i] = NULL;
> - TAILQ_REMOVE(&vnode_list, vp, v_vnodelist);
> - TAILQ_INSERT_TAIL(&vnode_list, vp, v_vnodelist);
> - MPASS(vp->v_dbatchcpu != NOCPU);
> - vp->v_dbatchcpu = NOCPU;
> - }
> - mtx_unlock(&vnode_list_mtx);
> - } else {
> - for (i = 0; i < VDBATCH_SIZE; i++) {
> - vp = vd->tab[i];
> - vd->tab[i] = NULL;
> - MPASS(vp->v_dbatchcpu != NOCPU);
> - vp->v_dbatchcpu = NOCPU;
> - }
> + for (i = 0; i < VDBATCH_SIZE; i++) {
> + vp = vd->tab[i];
> + TAILQ_REMOVE(&vnode_list, vp, v_vnodelist);
> + TAILQ_INSERT_TAIL(&vnode_list, vp, v_vnodelist);
> + MPASS(vp->v_dbatchcpu != NOCPU);
> + vp->v_dbatchcpu = NOCPU;
> }
> + mtx_unlock(&vnode_list_mtx);
> + bzero(vd->tab, sizeof(vd->tab));
> vd->index = 0;
> critical_exit();
> }
>
> Still with:
>
> # grep USE_TMPFS= /usr/local/etc/poudriere.conf
> # EXAMPLE: USE_TMPFS="wrkdir data"
> #USE_TMPFS=all
> #USE_TMPFS="data"
> USE_TMPFS=no
>
>
> That allowed the other builders to eventually reach "Builder started"
> and later activity, [00:05:50] [27] [00:02:29] Builder started
> being the first non-[01] to do so, no vlruwk's observed in what
> I saw in top:
>
> # poudriere bulk -jmain-amd64-bulk_a -a
> . . .
> [00:01:34] Building 34042 packages using up to 32 builders
> [00:01:34] Hit CTRL+t at any time to see build progress and stats
> [00:01:34] [01] [00:00:00] Builder starting
> [00:01:57] [01] [00:00:23] Builder started
> [00:01:57] [01] [00:00:00] Building ports-mgmt/pkg | pkg-1.20.4
> [00:03:08] [01] [00:01:11] Finished ports-mgmt/pkg | pkg-1.20.4: Success
> [00:03:21] [01] [00:00:00] Building print/indexinfo | indexinfo-0.3.1
> [00:03:21] [02] [00:00:00] Builder starting
> [00:03:21] [03] [00:00:00] Builder starting
> [00:03:21] [04] [00:00:00] Builder starting
> [00:03:21] [05] [00:00:00] Builder starting
> [00:03:21] [06] [00:00:00] Builder starting
> [00:03:21] [07] [00:00:00] Builder starting
> [00:03:21] [08] [00:00:00] Builder starting
> [00:03:21] [09] [00:00:00] Builder starting
> [00:03:21] [10] [00:00:00] Builder starting
> [00:03:21] [11] [00:00:00] Builder starting
> [00:03:21] [12] [00:00:00] Builder starting
> [00:03:21] [13] [00:00:00] Builder starting
> [00:03:21] [14] [00:00:00] Builder starting
> [00:03:21] [15] [00:00:00] Builder starting
> [00:03:21] [16] [00:00:00] Builder starting
> [00:03:21] [17] [00:00:00] Builder starting
> [00:03:21] [18] [00:00:00] Builder starting
> [00:03:21] [19] [00:00:00] Builder starting
> [00:03:21] [20] [00:00:00] Builder starting
> [00:03:21] [21] [00:00:00] Builder starting
> [00:03:21] [22] [00:00:00] Builder starting
> [00:03:21] [23] [00:00:00] Builder starting
> [00:03:21] [24] [00:00:00] Builder starting
> [00:03:21] [25] [00:00:00] Builder starting
> [00:03:21] [26] [00:00:00] Builder starting
> [00:03:21] [27] [00:00:00] Builder starting
> [00:03:21] [28] [00:00:00] Builder starting
> [00:03:21] [29] [00:00:00] Builder starting
> [00:03:21] [30] [00:00:00] Builder starting
> [00:03:21] [31] [00:00:00] Builder starting
> [00:03:21] [32] [00:00:00] Builder starting
> [00:03:29] [01] [00:00:08] Finished print/indexinfo | indexinfo-0.3.1:
> Success
> [00:03:30] [01] [00:00:00] Building devel/gettext-runtime |
> gettext-runtime-0.22
> [00:04:49] [01] [00:01:19] Finished devel/gettext-runtime |
> gettext-runtime-0.22: Success
> [00:04:52] [01] [00:00:00] Building devel/gmake | gmake-4.3_2
> [00:05:20] [01] [00:00:28] Finished devel/gmake | gmake-4.3_2: Success
> [00:05:23] [01] [00:00:00] Building devel/libtextstyle | libtextstyle-0.22
> [00:05:50] [27] [00:02:29] Builder started
> [00:05:50] [27] [00:00:00] Building devel/npth | npth-1.6
> [00:05:50] [08] [00:02:29] Builder started
> [00:05:50] [08] [00:00:00] Building devel/boost-jam | boost-jam-1.82.0
> [00:05:51] [16] [00:02:30] Builder started
> [00:05:51] [16] [00:00:00] Building devel/libdaemon | libdaemon-0.14_1
> [00:05:51] [19] [00:02:30] Builder started
> [00:05:51] [19] [00:00:00] Building graphics/libpotrace | libpotrace-1.16
> [00:05:51] [14] [00:02:30] Builder started
> [00:05:51] [14] [00:00:00] Building sysutils/gnome_subr | gnome_subr-1.0
> [00:05:51] [17] [00:02:30] Builder started
> [00:05:51] [17] [00:00:00] Building misc/hicolor-icon-theme |
> hicolor-icon-theme-0.17
> [00:05:51] [26] [00:02:30] Builder started
> [00:05:51] [26] [00:00:00] Building devel/libinotify | libinotify-20211018
> [00:05:51] [12] [00:02:30] Builder started
> [00:05:51] [12] [00:00:00] Building devel/evdev-proto | evdev-proto-5.8
> [00:05:51] [29] [00:02:30] Builder started
> [00:05:51] [29] [00:00:00] Building devel/opencl | opencl-3.0.14
> [00:05:51] [05] [00:02:30] Builder started
> [00:05:51] [05] [00:00:00] Building devel/autoconf-switch |
> autoconf-switch-20220527
> [00:05:51] [11] [00:02:30] Builder started
> [00:05:51] [11] [00:00:00] Building textproc/html2text |
> html2text-1.3.2a_1,1
> [00:05:51] [25] [00:02:30] Builder started
> [00:05:51] [04] [00:02:30] Builder started
> [00:05:51] [25] [00:00:00] Building audio/speexdsp | speexdsp-1.2.1
> [00:05:51] [04] [00:00:00] Building converters/libiconv | libiconv-1.17
> [00:05:51] [22] [00:02:30] Builder started
> [00:05:51] [22] [00:00:00] Building dns/public_suffix_list |
> public_suffix_list-20230630
> [00:05:51] [20] [00:02:30] Builder started
> [00:05:51] [02] [00:02:30] Builder started
> [00:05:51] [24] [00:02:30] Builder started
> [00:05:51] [31] [00:02:30] Builder started
> [00:05:51] [20] [00:00:00] Building print/gsfonts | gsfonts-8.11_8
> . . .
>
> I had waited a very long time with no such progress for [02]..[32]
> for the prior code.
>
> Now testing for the zfs deadlock issue should be possible for
> this setup.
>
Thanks for testing, I wrote a fix:
https://people.freebsd.org/~mjg/vfs-recycle-fix.diff
Applies to *stock* kernel (as in without the revert).
--
Mateusz Guzik <mjguzik gmail.com>