Re: Swap, ZFS & ARC

From: Daniel Ebdrup Jensen <debdrup_at_FreeBSD.org>
Date: Sat, 04 Feb 2023 15:16:17 UTC
On Thu, Feb 02, 2023 at 02:28:48PM +0000, jbo@insane.engineer wrote:
>Hello folks,
>
>Based on a discussion on the forums not so long ago I tried to figure out how swap usage on a ZFS system plays together with ARC. However, I could find very little to no information on this which leads me to believe that there is some "core concept" I might be oblivious to.
>
>The main question is basically this: Your system starts to swap out data from RAM to your swap partition. This swap data on disk ultimately resides somewhere in a ZFS pool. If this data then gets accessed, it might be cached by ARC essentially eating up memory again which seems counter productive.
>Is there any magic which prevents swap partitions from being loaded into ARC? Or is this a non-issue for some other reason?
>
>Best regards,
>~ joel

Hi Joel,

      The catch-22 mentioned elsewhere in the thread isn't unique; it's a
      function of how virtual memory systems work when they have paging,
      even if they're designed to try to work around it (which, as
      others have pointed out, is how FreeBSD works).

      The only way around it that I know of is to adjust the VM
      watermarks that control when certain triggers happen, but
      unfortunately this process is workload dependent, so there's no
      real way of making recommendations short of "if you're still seeing
      the catch-22 after adjusting the values, adjust them more".

      The values that need to be adjusted are the following OIDs,
      using sysctl(8):
      vm.v_free_min, vm.v_free_target, vm.v_free_reserved,
      vm.v_inactive_target, and vm.v_free_severe.
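
      Before changing anything, it helps to see where the watermarks
      currently sit. The following is only a minimal sketch: it assumes
      the OIDs above report page counts and that hw.pagesize gives the
      page size in bytes. The helper names pages_to_mib and
      show_watermarks are made up for illustration, and the script only
      reads values, it doesn't set any.

```sh
#!/bin/sh
# Convert a page count ($1) to whole MiB for a page size in bytes ($2).
# Integer arithmetic, so the result is truncated.
pages_to_mib() {
    echo $(( $1 * $2 / 1048576 ))
}

# Print each watermark named in this mail in pages and in MiB.
# Read-only: this only calls sysctl -n and changes nothing.
show_watermarks() {
    pagesize=$(sysctl -n hw.pagesize) || return 1
    for oid in vm.v_free_min vm.v_free_target vm.v_free_reserved \
               vm.v_inactive_target vm.v_free_severe; do
        # Skip any OID that this release does not expose.
        pages=$(sysctl -n "$oid") || continue
        echo "$oid: $pages pages ($(pages_to_mib "$pages" "$pagesize") MiB)"
    done
}
```

      Running show_watermarks on the machine in question gives you a
      baseline to compare against. A new value is then applied with
      sysctl(8) in the usual name=value form and, if it turns out to
      help, persisted in /etc/sysctl.conf; which numbers make sense
      depends on the workload, as said above.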

      There is a pretty big downside to this, though: a much bigger
      portion of your memory will be completely unused at any given
      point, which consequently means your ARC is much less efficient,
      there's less room for more things in other caches, and you're
      still paying for the electricity to maintain the state of the
      unused memory.
      There's also an issue you can run into if you ever have a panic(9)
      and it's caused by something in ZFS, because that'll mean that the
      core file can't be reliably expected to produce a meaningful
      backtrace.

      Ultimately, it's a question of what's easier - if you've set up your
      pool with no swap, and didn't reserve some space for future
      expansion and/or can't be arsed to do the zfs send | receive onto a
      new pool that has the space for it, the above method will at least
      let you get swap on ZFS without having to involve the ZPL by
      using a swap file (which is never a good idea, not even on UFS).

      As for your second question, the dataset zfsprops(7) parameter that
      you're looking for is primarycache, as it can be set to none, which
      keeps that dataset's blocks out of ARC. You'll also want to set
      org.freebsd:swap=on, turn off checksumming and compression, and
      disable sync.
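
      Put together, those properties could be passed at volume creation
      time roughly like this. Treat it as a sketch, not a recipe: the
      pool name zroot, the volume name swap and the 2G size are
      placeholders, and swap_zvol_cmd is a made-up helper that only
      prints the zfs(8) command line so you can review it before
      running it yourself.

```sh
#!/bin/sh
# Placeholders, adjust to your system.
POOL="zroot"    # assumption: name of the pool
SIZE="2G"       # assumption: size of the swap volume

# Print (not execute) a zfs create command for a swap zvol with the
# properties named in this mail. $1 = pool, $2 = volume size.
swap_zvol_cmd() {
    opts="-o org.freebsd:swap=on -o primarycache=none"
    opts="$opts -o checksum=off -o compression=off -o sync=disabled"
    echo "zfs create -V $2 $opts $1/swap"
}

swap_zvol_cmd "$POOL" "$SIZE"
```

      Once the volume exists, it should show up under /dev/zvol/ (here
      /dev/zvol/zroot/swap), where swapon(8) can enable it right away.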

Yours,
Daniel Ebdrup Jensen