Re: main 16 and 15.0-ALPHA4 [on amd64]: using a USB3 context gets extensive "flswai" [and "rename"] STATE time during poudriere builds (UFS context happens to be in use); more
Date: Sun, 05 Oct 2025 05:21:48 UTC
On Oct 2, 2025, at 21:43, Mark Millard <marklmi@yahoo.com> wrote:
> On Oct 2, 2025, at 20:45, Mark Millard <marklmi@yahoo.com> wrote:
>
>> On Sep 30, 2025, at 20:43, Mark Millard <marklmi@yahoo.com> wrote:
>>
>>> [The new material here ends up being about nameicap_cleanup
>>> and its exclusive use of mnt_renamelock being one potential
>>> bottleneck involved here. I make no claim it has anything to
>>> do with the flswai activity reported. The possible
>>> bottleneck is an observation, not something that I claim
>>> there is any alternative to. I do not know if this is of any
>>> interest or not.]
>>>
>>> On Sep 29, 2025, at 16:06, Mark Millard <marklmi@yahoo.com> wrote:
>>>
>>>> On Sep 29, 2025, at 13:01, Mark Millard <marklmi@yahoo.com> wrote:
>>>>
>>>>> An example is during the cpdup activities when multiple happen
>>>>> in overlapping time frames:
>>>>
>>>> I'll note that I see this on the amd64 32-FreeBSD-cpu system
>>>> but not on the aarch64 8-FreeBSD-cpu Windows Dev Kit 2023
>>>> system. Maybe at some point I'll try the older 16-FreeBSD-cpu
>>>> aarch64 (Cortex-A72) system.
>>>>
>>>> Also, on the 7950X3D amd64 system, I see the behavior with
>>>> 14.3-Stable. Apparently, this is not new with 15+. It has
>>>> been a long time since I'd tried using an amd64 system for
>>>> such activity based on using USB3 media. But it has been
>>>> common for me for aarch64 over that time frame.
>>>>
>>>>> . . .
>>>>> . . .
>>>
>>> . . .
>>>
>>>
>>>>> . . .
>>>>> . . .
>>>>>
>>>>> But I'll also see such on c compiles, ld commands, etc. I've
>>>>> not seen rename for pkg-static but I have seen flswai for it.
>>>>>
>>>>> The system spends lots of time 95%+ idle from the wait
>>>>> activities.
>>>>>
>>>>> I see such directly booted from the USB3 media (a 15.0-ALPHA4
>>>>> context on UFS media) and when using that media via chroot
>>>>> from both ZFS and UFS boots that are not USB based. The ZFS
>>>>> and UFS boots do not show the behavior with the normal
>>>>> non-USB3 media used instead.
>>>>>
>>>>> The system in use is an AMD 7950X3D with 32 FreeBSD cpus,
>>>>> 192 GiBytes of RAM. main 16 booting for non-USB boots
>>>>> and 15.0-ALPHA4 boots for the USB3 boots. kernel and
>>>>> world are via official pkgbase distribution installs:
>>>>> it is not a personal build of the kernel or world.
>>>>>
>>>>>
>>>>> . . .
>>>>> . . .
>>
>> I got a test context where I could compare the same
>> media used on the same USB4 port on a laptop
>> (Dell Precision 5490, 22 FreeBSD cpus, 32 GiBytes
>> RAM, 4 USB4 ports), where, on boot, the media ends
>> up being handled as either:
>>
>> ) "nda0 at nvme0" (via involving a Thunderbolt 3 hub)
>> ) "da0" (via a direct connection)
>>
>> (The UEFI/ACPI does enough to make basic operation
>> work, presenting some view to FreeBSD for media on
>> USB4 ports.)
>>
>> Both have the bottlenecks visible when monitored with
>> top, both "flswai" and "rename" examples occur in
>> both contexts.
>>
>> But "nda0 at nvme0" bottleneck periods do not last
>> nearly as long as "da0" bottleneck periods do, making
>> "nda0 at nvme0" use much more reasonable for the
>> type of activity.
>>
>> Still, this eliminates the possibility that the issue
>> was limited to USB. It also eliminates it being
>> specific to the prior AMD (7950X3D) test context.
>>
>>
>> For reference:
>>
>> # uname -apKU
>> FreeBSD USB4sys 16.0-CURRENT FreeBSD 16.0-CURRENT main-n280801-213170eb956f GENERIC-NODEBUG amd64 amd64 1600001 1600001
>>
>> (It is from official pkgbase distribution use, a
>> copy of another boot media with some parameters
>> replaced afterwards.)
>>
>> QUOTE
>> CPU: Intel(R) Core(TM) Ultra 7 165H (3072.00-MHz K8-class CPU)
>> Origin="GenuineIntel" Id=0xa06a4 Family=0x6 Model=0xaa Stepping=4
>> . . .
>> WARNING: L3 data cache covers more APIC IDs than a package (6 > 3)
>> FreeBSD/SMP: Multiprocessor System Detected: 22 CPUs
>> FreeBSD/SMP: Non-uniform topology
>> END QUOTE
>>
>> (The internal NVMe media has Dell's ubuntu on it.)
>>
>>
>> Note:
>> Ignoring very old Intel MacBook Pros and 2018 Mac
>> Minis, none of which I've ever native-booted FreeBSD
>> on, the Dell P. 5490 is the only Thunderbolt or USB4
>> based system that I've access to.
>
> I've discovered what is tied to the flswai/rename:
> Soft Updates had been enabled. Disabling Soft Updates
> leads to biowr and getblk as what shows for the
> bottleneck. (There is still a bottleneck.)
>
An FYI showing what can happen for the 32 FreeBSD cpu
USB3-based test context. The below occurred later,
after the initial bottleneck sequence when the jails
are being filled in via cpdup of the ref (for UFS
context via chrooting into it to then start the
build):

# uname -apKU
FreeBSD 7950X3D-ZFS 16.0-CURRENT FreeBSD 16.0-CURRENT main-n280910-85531add2844 GENERIC-NODEBUG amd64 amd64 1600001 1500066

[00:18:54] [official-amd64-default] [2025-10-04_21h23m47s] [parallel_build] Time: 00:17:58
Queued: 693 Inspected: 0 Ignored: 0 Built: 33 Failed: 0 Skipped: 0 Fetched: 0 Remaining: 660
ID TOTAL ORIGIN PKGNAME PHASE TIME TMPFS CPU% MEM%
[29] 00:08:55 databases/lmdb | lmdb-0.9.33,1 starting 00:08:55
[01] 00:02:53 graphics/libpotrace | libpotrace-1.16 starting 00:02:53
[15] 00:07:37 archivers/lzo2 | lzo2-2.10_1 starting 00:07:37
[30] 00:07:22 lang/lua53 | lua53-5.3.6_1 starting 00:07:22
[02] 00:07:51 devel/libmtdev | libmtdev-1.1.7 starting 00:07:51
[16] 00:07:35 lang/lua54 | lua54-5.4.8 starting 00:07:35
[31] 00:07:27 sysutils/libsunacl | libsunacl-1.0.1_1 starting 00:07:27
[03] 00:08:07 sysutils/dmidecode | dmidecode-3.6 starting 00:08:07
[17] 00:08:33 x11/xorgproto | xorgproto-2024.1 starting 00:08:33
[32] 00:07:38 print/indexinfo | indexinfo-0.3.1_1 starting 00:07:38
[04] 00:06:18 devel/libdatrie | libdatrie-0.2.13_2 starting 00:06:18
[18] 00:07:52 shells/bash-completion-freebsd | bash-completion-freebsd-1.4.0 starting 00:07:52
[05] 00:08:22 audio/libvorbis | libvorbis-1.3.7_2,3 starting 00:08:22
[19] 00:07:34 comms/iwmbt-firmware | iwmbt-firmware-20250410 starting 00:07:34
[06] 00:04:56 sysutils/sdparm | sdparm-1.12_1 starting 00:04:56
[20] 00:08:01 shells/ksh93 | ksh93-93.u_4,2 starting 00:08:01
[07] 00:08:02 textproc/sdocbook-xml | sdocbook-xml-1.1_2,2 starting 00:08:02
[21] 00:08:49 archivers/libmspack | libmspack-0.11alpha starting 00:08:49
[08] 00:08:23 benchmarks/iperf3 | iperf3-3.19.1 starting 00:08:23
[22] 00:08:01 textproc/xmlcharent | xmlcharent-0.3_2 starting 00:08:01
[09] 00:08:03 devel/pkgconf | pkgconf-2.4.3,1 starting 00:08:03
[23] 00:08:15 lang/perl5.42 | perl5-5.42.0_1 starting 00:08:15
[10] 00:08:16 devel/opencl | opencl-3.0.19 starting 00:08:16
[24] 00:08:11 benchmarks/bonnie | bonnie-2.0.6_2 starting 00:08:11
[11] 00:08:45 security/easy-rsa | easy-rsa-3.2.4,1 starting 00:08:45
[25] 00:07:48 audio/mpg123 | mpg123-1.33.2 starting 00:07:48
[12] 00:03:34 databases/sqlite3@default | sqlite3-3.50.2_1,1 starting 00:03:34
[26] 00:08:45 ports-mgmt/portconfig | portconfig-0.6.2_2 starting 00:08:45
[13] 00:07:54 textproc/expat2 | expat-2.7.3 starting 00:07:54
[27] 00:07:49 multimedia/v4l_compat | v4l_compat-1.23.0_7 starting 00:07:49
[14] 00:08:27 dns/public_suffix_list | public_suffix_list-20250828 starting 00:08:27
[28] 00:07:54 devel/libdwarf | libdwarf-20161124 starting 00:07:54
It does eventually make progress again after this.
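
For anyone wanting to pull the stalled builders out of a
saved copy of a status display like the one above, here is
a rough Python sketch. The row layout is only what the
display above happens to show; the function names and the
5 minute threshold are my own assumptions, not anything
poudriere provides.

```python
import re
from datetime import timedelta

# One builder row of the status display, for example:
#   [29] 00:08:55 databases/lmdb | lmdb-0.9.33,1 starting 00:08:55
# (Layout assumed from the display above, not from poudriere documentation.)
ROW = re.compile(
    r"^\[(?P<id>\d+)\]\s+(?P<total>\d+:\d\d:\d\d)\s+(?P<origin>\S+)\s+\|\s+"
    r"(?P<pkgname>\S+)\s+(?P<phase>\S+)\s+(?P<phase_time>\d+:\d\d:\d\d)"
)


def hms(text):
    """Convert an HH:MM:SS string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def stalled_builders(status_text, phase="starting",
                     threshold=timedelta(minutes=5)):
    """Return (builder id, origin, time in phase) tuples for builders that
    have sat in `phase` for at least `threshold`, longest wait first."""
    rows = []
    for line in status_text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue  # header, summary, or unrelated line
        waited = hms(match["phase_time"])
        if match["phase"] == phase and waited >= threshold:
            rows.append((int(match["id"]), match["origin"], waited))
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

Feeding it the display above would list the builders that
have been in "starting" for 5 minutes or more, which makes
it easier to compare one run against another than reading
the raw display.
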
The laptop 22 FreeBSD cpu context with external USB4
media being handled as "nda0 at nvme0" (via using a
Thunderbolt 3 hub between the laptop and the media)
has much better performance for builds. (No chroot
is involved here.)

# uname -apKU
FreeBSD USB4sys 16.0-CURRENT FreeBSD 16.0-CURRENT main-n280910-85531add2844 GENERIC-NODEBUG amd64 amd64 1600001 1600001

Shorter duration bottlenecking is still visible but
use would be much more reasonable in this type of
context.

I'll note that I did not notice any "rename" STATE
bottleneck time during this activity but did see
"flswai" STATE bottlenecking. (Soft Updates was
enabled.)
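
The STATE observations here are from watching top by eye.
A rough Python sketch for tallying them from captured
process listings instead follows. It assumes text gathered
with something like "ps -axH -o pid,state,mwchan,comm"
(so the third column is the wait channel name); the
function name and the list of watched names are my own
choices for illustration.

```python
from collections import Counter

# Wait channel names reported in this thread: flswai/rename with
# Soft Updates enabled, biowr/getblk with Soft Updates disabled.
WATCHED = ("flswai", "rename", "biowr", "getblk")


def count_wait_channels(ps_text, watched=WATCHED):
    """Count rows whose third column (assumed to be the wait channel
    name) is one of `watched`. Rows that do not start with a numeric
    pid, such as the header line, are skipped."""
    counts = Counter({name: 0 for name in watched})
    for line in ps_text.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        wchan = fields[2]
        if wchan in counts:
            counts[wchan] += 1
    return counts
```

Taking a capture every few seconds during a build and
running each through count_wait_channels would give a
crude timeline of how many threads are sitting in each
of the four states.
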
===
Mark Millard
marklmi at yahoo.com