Re: An attempted test of main's "git: 2ad756a6bbb3" "merge openzfs/zfs@95f71c019" that did not go as planned

From: Mark Millard <marklmi_at_yahoo.com>
Date: Mon, 04 Sep 2023 07:13:51 UTC
On Sep 3, 2023, at 19:54, Mark Millard <marklmi@yahoo.com> wrote:

> ThreadRipper 1950X (32 hardware threads) doing bulk -J128
> with USE_TMPFS=no , no ALLOW_MAKE_JOBS , no
> ALLOW_MAKE_JOBS_PACKAGES , USB3 NVMe SSD storage/ZFS-boot-media,
> debug system build in use :
> 
> [00:03:44] Building 34214 packages using up to 128 builders
> [00:03:44] Hit CTRL+t at any time to see build progress and stats
> [00:03:44] [01] [00:00:00] Builder starting
> [00:04:37] [01] [00:00:53] Builder started
> [00:04:37] [01] [00:00:00] Building ports-mgmt/pkg | pkg-1.20.6
> [00:05:53] [01] [00:01:16] Finished ports-mgmt/pkg | pkg-1.20.6: Success
> [00:06:15] [01] [00:00:00] Building print/indexinfo | indexinfo-0.3.1
> [00:06:15] [02] [00:00:00] Builder starting
> . . .
> [00:06:18] [128] [00:00:00] Builder starting
> [00:07:42] [01] [00:01:27] Finished print/indexinfo | indexinfo-0.3.1: Success
> [00:07:45] [01] [00:00:00] Building devel/gettext-runtime | gettext-runtime-0.22_1
> [00:18:45] [01] [00:11:00] Finished devel/gettext-runtime | gettext-runtime-0.22_1: Success
> [00:19:06] [01] [00:00:00] Building devel/gmake | gmake-4.3_2
> [00:24:13] [01] [00:05:07] Finished devel/gmake | gmake-4.3_2: Success
> [00:24:39] [01] [00:00:00] Building devel/libtextstyle | libtextstyle-0.22
> [00:31:08] [125] [00:24:50] Builder started
> [00:31:08] [125] [00:00:00] Building print/t1utils | t1utils-1.32
> [00:31:15] [33] [00:25:00] Builder started
> [00:31:15] [81] [00:24:59] Builder started
> [00:31:15] [33] [00:00:00] Building databases/xapian-core | xapian-core-1.4.23,1
> [00:31:15] [13] [00:25:00] Builder started
> [00:31:15] [81] [00:00:00] Building devel/bmake | bmake-20230723
> [00:31:15] [13] [00:00:00] Building devel/evdev-proto | evdev-proto-5.8
> [00:31:16] [41] [00:25:00] Builder started
> [00:31:16] [41] [00:00:00] Building devel/pcre | pcre-8.45_3
> . . .
> 
> (Looks like lang/go120 ignores the lack of ALLOW_MAKE_JOBS .
> There may be others that still have significant parallel
> activity.)
> 
> [main-amd64-bulk_a-default] [2023-09-03_13h48m45s] [parallel_build:] Queued: 34588 Built: 727   Failed: 1     Skipped: 40    Ignored: 335   Fetched: 0     Tobuild: 33485  Time: 01:36:51
> 
> (So about 1 hr after the last "Builder starting" it had
> built 727.)
> 
> The vast majority of the time: lots of cpdup's, with tx->tx
> showing in STATE most of the time, though still accumulating
> some CPU time.
> 
> ^T commonly showed various Builders in the starting PHASE for
> 3min..6min.
> 
> Around 66% mean Idle time (a guess from watching top).
> 
> After the ^C, "gstat -spod" reports almost continuous writing,
> 2200 to 2500 writes per second or so, for *hours* (still
> going on).
> 
> ztop reports 1500 to 3200 d/s or so, almost always for the
> Dataset zamd64/poudriere/data/.m (also still going
> on). Normally no other Dataset is shown.
> 
> With all the disk I/O activity, this is definitely "live"
> in some sense. But I've no clue whether it is just repeating
> itself over and over vs. making some sort of progress.
> 
> For reference for the ^C and after:
> 
> ^C[01:39:00] [20] [00:00:03] Building sysutils/linux-c7-dosfstools | linux-c7-dosfstools-3.0.20
> [01:39:00] [93] [00:07:12] Finished science/dimod | dimod-0.12.11: Success
> [01:39:00] Error: Signal SIGINT caught, cleaning up and exiting
> [01:39:02] [63] [00:06:34] Finished archivers/unarj | unarj-2.65_2: Success
> [01:39:03] [128] [00:07:47] Finished sysutils/shuf | shuf-3.0: Success
> [01:39:04] [113] [00:07:06] Finished devel/bsddialog | bsddialog-0.4.1: Success
> [main-amd64-bulk_a-default] [2023-09-03_13h48m45s] [sigint:] Queued: 34588 Built: 752   Failed: 1     Skipped: 40    Ignored: 335   Fetched: 0     Tobuild: 33460  Time: 01:38:56
> [01:39:06] Logs: /usr/local/poudriere/data/logs/bulk/main-amd64-bulk_a-default/2023-09-03_13h48m45s
> [01:39:14] [12] [00:09:07] Finished archivers/rzip | rzip-2.1_1: Success
> [01:39:14] Cleaning up
> exit: cannot open ./var/run/49_nohang.pid: No such file or directory
> exit: cannot open ./var/run/87_nohang.pid: No such file or directory
> 
> After that ^t produced the likes of:
> 
> load: 6.39  cmd: sh 4849 [tx->tx_quiesce_done_cv] 10047.33r 0.51u 121.32s 1% 13004k
> #0 0xffffffff80b6f103 at mi_switch+0x173
> #1 0xffffffff80bc0f24 at sleepq_switch+0x104
> #2 0xffffffff80aec4c5 at _cv_wait+0x165
> #3 0xffffffff82aba365 at txg_wait_open+0xf5
> #4 0xffffffff82a11b81 at dmu_free_long_range+0x151
> #5 0xffffffff829a87d2 at zfs_rmnode+0x72
> #6 0xffffffff829b658d at zfs_freebsd_reclaim+0x3d
> #7 0xffffffff8113a495 at VOP_RECLAIM_APV+0x35
> #8 0xffffffff80c5a7d9 at vgonel+0x3a9
> #9 0xffffffff80c5af7f at vrecycle+0x3f
> #10 0xffffffff829b643e at zfs_freebsd_inactive+0x4e
> #11 0xffffffff80c598cf at vinactivef+0xbf
> #12 0xffffffff80c590da at vput_final+0x2aa
> #13 0xffffffff80c68886 at kern_funlinkat+0x2f6
> #14 0xffffffff80c68588 at sys_unlink+0x28
> #15 0xffffffff8106323f at amd64_syscall+0x14f
> #16 0xffffffff8103512b at fast_syscall_common+0xf8
> 
> The console/logs do report "witness exhausted":
> 
> . . .
> Sep  3 13:41:08 amd64-ZFS login[1751]: ROOT LOGIN (root) ON ttyv0
> Sep  3 13:51:35 amd64-ZFS kernel: witness_lock_list_get: witness exhausted
> Sep  3 14:26:38 amd64-ZFS kernel: pid 27418 (conftest), jid 245, uid 0: exited on signal 11 (core dumped)
> . . .
> 
> So it did not take long for the "witness exhausted" to
> happen.
> 
> # uname -apKU
> FreeBSD amd64-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT amd64 1500000 #74 main-n265143-525bc87f54f2-dirty: Sun Sep  3 13:35:04 PDT 2023     root@amd64_ZFS:/usr/obj/BUILDs/main-amd64-dbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-DBG amd64 amd64 1500000 1500000
> 
> 
> Looks like I'll be forcing the machine to reboot or
> to power off. The media was deliberately set up for
> doing risky tests. It is not my normal environment.
> 

Using -J64 instead of -J128 does avoid "witness exhausted",
at least for the 1st hour.

[00:03:51] Building 34214 packages using up to 64 builders
[00:03:51] Hit CTRL+t at any time to see build progress and stats
[00:03:51] [01] [00:00:00] Builder starting
[00:04:49] [01] [00:00:58] Builder started
[00:04:49] [01] [00:00:00] Building ports-mgmt/pkg | pkg-1.20.6
[00:06:07] [01] [00:01:18] Finished ports-mgmt/pkg | pkg-1.20.6: Success
[00:06:31] [01] [00:00:00] Building print/indexinfo | indexinfo-0.3.1
[00:06:31] [02] [00:00:00] Builder starting
. . .
[00:06:33] [64] [00:00:00] Builder starting
[00:09:06] [01] [00:02:35] Finished print/indexinfo | indexinfo-0.3.1: Success
[00:09:08] [01] [00:00:00] Building devel/gettext-runtime | gettext-runtime-0.22_1
[00:21:49] [16] [00:15:18] Builder started
[00:21:49] [16] [00:00:00] Building devel/libdaemon | libdaemon-0.14_1
[00:21:49] [29] [00:15:17] Builder started
[00:21:49] [20] [00:15:18] Builder started
[00:21:49] [41] [00:15:17] Builder started
[00:21:49] [29] [00:00:00] Building textproc/libunibreak | libunibreak-5.1,1
[00:21:49] [20] [00:00:00] Building graphics/poppler-data | poppler-data-0.4.12
[00:21:49] [35] [00:15:17] Builder started
[00:21:49] [41] [00:00:00] Building archivers/libmspack | libmspack-0.11alpha
. . .
[main-amd64-bulk_a-default] [2023-09-03_20h48m38s] [parallel_build:] Queued: 34588 Built: 438   Failed: 1     Skipped: 50    Ignored: 335   Fetched: 0     Tobuild: 33764  Time: 01:21:30
. . .
^C[01:21:57] [32] [00:07:04] Finished devel/p5-Test-Deep | p5-Test-Deep-1.204: Success
[01:21:57] Error: Signal SIGINT caught, cleaning up and exiting
[01:22:03] [39] [00:06:01] Finished textproc/p5-Lingua-Stem-Ru | p5-Lingua-Stem-Ru-0.04: Success
[01:22:04] [35] [00:06:09] Finished devel/p5-ExtUtils-InstallPaths | p5-ExtUtils-InstallPaths-0.012: Success
[main-amd64-bulk_a-default] [2023-09-03_20h48m38s] [sigint:] Queued: 34588 Built: 442   Failed: 1     Skipped: 50    Ignored: 335   Fetched: 0     Tobuild: 33760  Time: 01:21:50
[01:22:04] Logs: /usr/local/poudriere/data/logs/bulk/main-amd64-bulk_a-default/2023-09-03_20h48m38s
[01:22:11] Cleaning up

(So only around 438 built in a time frame similar to the one in
which the -J128 test got to more like 752. But the -J128 run had
the "witness exhausted" status as well.)

It turns out that a measure of progress is the USED figure that
zfs list reports for /usr/local/poudriere/data/.m/ . The cleanup
spends lots of time waiting in various processes during its
deletion activity.
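A minimal sketch of watching that progress (assuming the dataset name zamd64/poudriere/data/.m seen in the ztop output; the 10-second interval is arbitrary). It needs a live ZFS pool, so it is shown only as a command fragment:

```shell
# Poll the USED figure for the poudriere working dataset; a falling
# USED value indicates the cleanup's pending frees are completing.
while true; do
    date
    zfs list -o name,used,refer zamd64/poudriere/data/.m
    sleep 10
done
```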

Previously I'd been told to use vfs.zfs.per_txg_dirty_frees_percent=5
instead of the default (30) in order to avoid ending up with sustained
very small load averages in my poudriere bulk runs for my kind of
context. (5 is the older default, as it turns out.) This may be
somewhat of a deletion/cleanup-stage variant of that sort of issue.
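For reference, a sketch of applying that tuning (the value 5 is the one mentioned above; whether it helps this particular cleanup stall is untested):

```shell
# Limit how much of a txg's dirty space frees may consume, spreading
# large deletions across more txgs (5 was the older default, 30 the
# current one). Takes effect immediately on a running system.
sysctl vfs.zfs.per_txg_dirty_frees_percent=5

# To persist across reboots, add the following line to /etc/sysctl.conf:
# vfs.zfs.per_txg_dirty_frees_percent=5
```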

It may be that using a builder count of 2+ times the 32 hardware
threads just does not scale the same way a factor of 2 did for folks
testing for the deadlock issue on 4-hardware-thread machines. Maybe
-J36 (so: 32+4) would be more reasonable for deadlock testing in
this context, possibly avoiding running into these other issues so
strongly.
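As a sketch of that arithmetic (hw.ncpu is the FreeBSD sysctl for the hardware thread count; the +4 pad is just the guess above, not an established rule):

```shell
# Derive a poudriere -J value as hardware threads plus a small pad,
# instead of the 2x-and-up multiples used on smaller machines.
# Falls back to 32 (this 1950X's thread count) if sysctl fails.
ncpu=$(sysctl -n hw.ncpu 2>/dev/null || echo 32)
jflag=$((ncpu + 4))
echo "-J${jflag}"
```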

===
Mark Millard
marklmi at yahoo.com