[Bug 275594] High CPU usage by arc_prune; analysis and fix

From: <bugzilla-noreply_at_freebsd.org>
Date: Thu, 25 Jan 2024 03:50:23 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=275594

--- Comment #40 from Seigo Tanimura <seigo.tanimura@gmail.com> ---
(In reply to Seigo Tanimura from comment #38)

The results of the fixed stable/13 (13.3-PRERELEASE) branch are now ready to
share.

Thomas, could you please reproduce the build with this kernel and see if the
build time improves?  If so, I will work on merging the fix.  Thanks in
advance.

* Sources on GitHub:

The same as comment #38.

* Test results

Test Summary:

- Branch and commit: stable/13-topic-openzfs-arc_prune-regulation-counters,
ef898378041a1c67cd102e8e5eaca123a543029c
- Date: 24 Jan 2024 12:10Z - 24 Jan 2024 18:02Z
- Build time: 05:51:26 (363 pkgs / hr)
- Failed port(s): 2
- Skipped port(s): 2
- Setup
  - sysctl(3)
    - vfs.zfs.arc_max: 4294967296
      - 4 GiB.
    - vfs.zfs.arc.dnode_limit: 0 (default)
      - kstat.zfs.misc.arcstats.arc_dnode_limit: 322122547 (calculated
automatically)
  - poudriere-bulk(8)
    - USE_TMPFS="wrkdir data localbase"
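
As a rough cross-check of the automatically calculated dnode limit above, the
sketch below reproduces the value from vfs.zfs.arc_max.  It assumes the
metadata limit is 75% of arc_max and the dnode limit is 10% of the metadata
limit; these percentages are my assumption about the defaults in this OpenZFS
version and are not stated in this comment.  The function and constant names
are made up for illustration.

```python
# Hypothetical helper; the percentages below are assumed defaults,
# not values taken from this bug report.
ARC_MAX = 4294967296              # vfs.zfs.arc_max from the setup above
ASSUMED_META_LIMIT_PERCENT = 75   # assumption: metadata limit, % of arc_max
ASSUMED_DNODE_LIMIT_PERCENT = 10  # assumption: dnode limit, % of metadata limit


def estimate_dnode_limit(arc_max,
                         meta_percent=ASSUMED_META_LIMIT_PERCENT,
                         dnode_percent=ASSUMED_DNODE_LIMIT_PERCENT):
    """Estimate the dnode limit in bytes using integer arithmetic."""
    meta_limit = arc_max * meta_percent // 100
    return meta_limit * dnode_percent // 100


if __name__ == "__main__":
    print("arc_max in GiB:", ARC_MAX / 2**30)
    print("estimated dnode limit:", estimate_dnode_limit(ARC_MAX))
```

Under those assumptions the estimate agrees with the reported
kstat.zfs.misc.arcstats.arc_dnode_limit of 322122547.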

Result Chart Archive: (poudriere-bulk-13_3_prerelease-2024-01-24_21h20m00s.7z,
Attachment #247941)

- zfs-znodes-and-dnodes.png
  - The counts of the ZFS znodes and dnodes.
- zfs-arc-pruning-regulation.png
  - The counts of the ARC prune triggers by ZFS and the skips by the fix.
- zfs-dnodes-and-freeing-activity.png
  - The freeing activity of the ZFS znodes and dnodes.
- vnode-free-calls.png
  - The calls to the ZFS vnode freeing functions.

* Findings and Analysis

- The ARC pruning worked in the same way as on 14.0-RELEASE.
  - The prunable znodes were pruned down to less than 10% of the dnodes.
  - The behaviour after 18:10Z was due to the nightly cron job started at
18:01Z.

- The build time was virtually the same as in comment #38.
  - It was also virtually the same as on 14.0-RELEASE.

- The zfskern{arc_evict} thread used up to 100% of a CPU in the final ~1 hour
of the build.
  - The reason is not clear.
  - There were no significant effects on the system.
  - zfskern{arc_evict} stopped running upon completing the build.
