Re: vminfo provider for FreeBSD

From: Peter Johnson <pete-fbsd_at_hiddenrock.com>
Date: Mon, 06 Jun 2022 16:21:42 UTC

Thanks for the detailed reply.

I'm coming at this from a slightly unusual angle: I use FreeBSD as the basis
for my operating systems course. Part of the course asks students to write
programs that exercise operating system facilities such as scheduling and
virtual memory in specific ways, and then to use DTrace to confirm that their
programs behave as intended (e.g., "write a program that induces swapping and
a D script that proves it works"). Currently, the best way I've found for them
to do this for virtual memory is vmstat(8), but it would be nice to have those
statistics at per-process granularity.

The Illumos documentation on the vminfo provider itself suggests, as a use
case, getting finer-grained information than the Illumos implementation of
vmstat makes available [1], where "finer-grained" means both per-process
statistics and, e.g., more information about individual faults.

Given the reservations you note/confirm (potentially onerous overhead of SDT
probes in VM code, lack of a clear mapping between Illumos probes and the
FreeBSD codebase, scalability of arg1, especially with respect to SMP systems,
and complexity of per-domain NUMA stats), I propose the following as a first
step:

    Implement probes that fire whenever values in the "page" category of
    vmstat(8) output change; that is, when a page fault occurs (flt), a page
    is reactivated (re), a page is paged in (pi), a page is paged out (po), a
    page is freed (fr), or a page is scanned by the page daemon (sr).

I am unfamiliar with the codebase, but it seems likely to me that all of those
use counter(9), and so we would be able to correctly populate arg1.
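
To make the arg1 question concrete, here is a minimal userland sketch (not
kernel code) of what populating arg1 from a per-CPU counter involves. It only
models the idea behind counter(9) as you describe it: cheap per-CPU updates,
with the global value obtained by summing every slot. NCPU, struct
pcpu_counter, and all function names are hypothetical, invented for
illustration. The sketch sums before incrementing; whether arg1 should carry
the pre- or post-increment value is a detail to check against the Illumos
documentation.

```c
/*
 * Userland sketch only: models counter(9)-style per-CPU counters.
 * NCPU, struct pcpu_counter, and the function names are hypothetical.
 */
#include <assert.h>
#include <stdint.h>

#define NCPU 4	/* assumed CPU count for the sketch */

struct pcpu_counter {
	uint64_t slot[NCPU];	/* one slot per CPU, so updates do not contend */
};

/* Cheap path: bump only the calling CPU's slot. */
static void
pcpu_counter_add(struct pcpu_counter *c, unsigned cpu, uint64_t n)
{
	assert(cpu < NCPU);
	c->slot[cpu] += n;
}

/* Expensive path: the global value requires visiting every slot. */
static uint64_t
pcpu_counter_fetch(const struct pcpu_counter *c)
{
	uint64_t sum = 0;

	for (unsigned i = 0; i < NCPU; i++)
		sum += c->slot[i];
	return (sum);
}

/*
 * What a vminfo-style probe site would have to do if arg1 must carry
 * the global value: sum all slots on every firing, then update.
 * Returns the value that would be handed to the probe as arg1.
 */
static uint64_t
probe_site(struct pcpu_counter *c, unsigned cpu, uint64_t n)
{
	uint64_t arg1 = pcpu_counter_fetch(c);	/* O(NCPU) on each firing */

	pcpu_counter_add(c, cpu, n);
	return (arg1);
}
```

The point of the sketch is that the sum is O(number of CPUs) at every probe
site. A real implementation would presumably want to do that work only when
the probe is enabled, which is part of what the overhead measurements should
tell us.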

This would be a very modest amount of work (at least relative to porting the
entire vminfo provider as it exists in Illumos) and would give us a starting
point for measuring SDT overhead.

Once those proposed probes are in place, we can decide whether to implement
other probes from the Illumos set, add new probes that we determine useful,
optimize the SDT implementation, address SMP or NUMA considerations, etc.

Thoughts?

pete

[1] https://illumos.org/books/dtrace/chp-vminfo.html#chp-vminfo-3

On Fri, Jun 03, 2022 at 03:47:31PM -0400, Mark Johnston wrote:
> On Thu, Jun 02, 2022 at 01:08:27PM -0400, Peter Johnson wrote:
> > Hi there --
> > 
> > I would find the probes in Illumos' vminfo provider [1] really handy to have
> > in FreeBSD and I'm happy to do the work to make it happen.  The only
> > FreeBSD-related mention of the vminfo provider I can find is an old mailing
> > list post [2] that I interpret to mean that the existing fbt probes aren't a
> > meaningful alternative (not to mention that using fbt probes effectively
> > requires more understanding of the source code than is perhaps desirable given
> > DTrace's intended purpose/audience).
> > 
> > My first question is: would such an addition be welcome?  I can make a more
> > detailed case for its inclusion if that would be helpful/persuasive.
> 
> I think it'd be welcome.  My major reservation is that SDT probes have
> non-zero overhead even when disabled, especially on FreeBSD as currently
> implemented.  The vminfo provider effectively adds a probe to various VM
> counter increments, which can occur very very frequently in some
> workloads, so I think we'd also want to
> 1) try to measure that overhead, perhaps using some micro-benchmarks,
> 2) possibly use the results to help motivate some long-overdue
>    improvements to the SDT implementation.
> I'd be interested in helping with both of these.
> 
> It'd be helpful to see an example or two demonstrating how the vminfo
> provider would be useful in diagnosing a particular problem.
> 
> > If it is welcome, my plan would be to get very well-acquainted with FreeBSD's
> > VM subsystem, identify where each of the vminfo probes described in the
> > Illumos documentation should go, and then develop a patch to add those probes,
> > seeking feedback from both freebsd-dtrace folks and whichever group has
> > dominion over the VM stuff.
> > 
> > My second question is: does this sound like a reasonable plan?  It is,
> > admittedly, almost uselessly high level, but I expect I will need more than a
> > little familiarity with the codebase before I can get more specific.
> 
> Looking through the provider documentation, I suspect it'll be difficult
> to implement some of the probes on FreeBSD, as you note below.  For
> instance, I'm not sure that execfree can be implemented at all; FreeBSD
> doesn't have any (cheap) way to determine whether a given physical page
> belongs to an executable image.  At least, I can't think of one.
> 
> A second issue is in the description of "arg1" for vminfo probes.  In
> FreeBSD, frequently-updated counters are implemented using counter(9),
> which provides per-CPU counters.  To get the global value of such a
> counter, one must iterate over all per-CPU elements, summing them up.
> That's quite expensive and wasteful if you're doing it every time a
> vminfo probe fires.  I'm not sure how best to deal with that problem.
> 
> Yet another consideration is how one might expose per-NUMA domain
> counters.  We could simply ignore that consideration and just provide
> global values, but per-domain info can be very useful.
> 
> FreeBSD's VM system has a number of counters, exposed in various
> subtrees of the "vm" sysctl node.  One might start by looking at the
> existing counters to see how closely they match vminfo probes, or simply
> define FreeBSD's vminfo provider in terms of the existing counters,
> possibly adding new ones.
> 
> > Given the mailing list post I mentioned above, it seems possible that some of
> > the vminfo probes described in the Illumos documentation don't make sense in
> > the context of FreeBSD (eg, if FreeBSD doesn't have a distinct paging daemon,
> > then the pgrrun, rev, and scan probes aren't suited for transfer).  On the
> > other hand, there may be aspects on the FreeBSD side which would be beneficial
> > to monitor, but for which Illumos does not define probes.
> 
> I agree.  FreeBSD does have a paging daemon, implemented in vm_pageout.c.
> 
> > Therefore, my third question is: how important is it for a vminfo provider
> > implementation in FreeBSD to hew closely to the Illumos implementation?  Would
> > it be acceptable to not transfer some probes that don't make sense and add
> > some new probes that do?  Documentation is obviously vital for any deviations,
> > and I will make darn sure to make it a central part of the work.
> 
> Having ported the ip/tcp/udp providers based on illumos documentation,
> and having gone through some effort to make them compatible, I'm fairly
> skeptical that it's important to maintain compatibility.  Most
> non-trivial D scripts that I've seen and written which use these
> providers will also make use of FBT probes here and there, so some
> porting work is needed regardless.  Based on that, and on the
> observations above, compatibility shouldn't be a priority IMHO.
> 
> > Any and all feedback is most appreciated.
> > 
> > Thanks.
> > 
> > pete
> > 
> > 
> > [1] https://illumos.org/books/dtrace/chp-vminfo.html
> > [2] https://lists.freebsd.org/pipermail/freebsd-dtrace/2014-April/000209.html
> > 
> > 
>