[Bug 266013] Reported system memory decreases on large systems

From: <bugzilla-noreply_at_freebsd.org>
Date: Wed, 24 Aug 2022 08:37:14 UTC

            Bug ID: 266013
           Summary: Reported system memory decreases on large systems
           Product: Base System
           Version: CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: thj@FreeBSD.org

Created attachment 236086
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=236086&action=edit
Plot of memory reported by system

On our large database systems, the memory reported by top (taken from
the vm sysctls) decreases over time. This gives the appearance that FreeBSD
is losing pages.

We are experiencing this on a number of systems running FreeBSD 13, with
memory in the terabyte range (systems have 4TB, 3.2TB), and multiple NUMA
domains (2 and 4). Over time the missing pages increase until they account for
~50% of installed system memory, with roughly 1.5TB missing on the 3.2TB
system.

Over time the memory fields reported by top (active, inactive, laundry, wired,
buf, free) stop summing to the amount of memory in the machine.

Included is a plot showing the total memory that would be reported by top as a
percentage of the total pages the system has.

We have captured the output of `sysctl vm` on affected systems over long
periods: an hour with 1 second sampling (upper plot) and 24 hours with 1 minute
sampling (lower two plots).

While debugging this effect we patched the vm system to report the per-domain
page_count. The upper plot in the figure shows this information for each of the
memory domains in that system as well.

These systems are part of a Galera MySQL cluster. When the Galera processes are
restarted, the missing memory is accounted for again.
