Re: vminfo provider for FreeBSD
- Reply: Peter Johnson : "Re: vminfo provider for FreeBSD"
- In reply to: Peter Johnson : "Re: vminfo provider for FreeBSD"
Date: Tue, 07 Jun 2022 17:17:06 UTC
On Mon, Jun 06, 2022 at 12:21:42PM -0400, Peter Johnson wrote:
> Thanks for the detailed reply.
>
> I'm coming at this from a bit of a weird angle: I use FreeBSD as the basis for
> my operating systems course, part of which asks students to write programs
> that exercise operating system functionalities like scheduling and virtual
> memory in specific ways and then use DTrace to confirm that their programs are
> doing what they're supposed to do vis-a-vis those functionalities (eg, "write
> a program that induces swapping and a D script that proves it works").

Cool!

> Currently, the best way I've figured out for them to do this in the context of
> virtual memory is to use vmstat(1), but it would be nice to have those
> statistics at a process granularity.

That makes sense.  30 minutes ago I was wishing I could check whether a
given process took the "optimized COW fault" path in vm_fault.c (see the
v_cow_optim counter).  vmstat -s doesn't give a reliable answer: merely
running that command itself causes the counter to increment.

> The Illumos documentation on the vminfo provider itself suggests as a use case
> getting more fine-grained information than the Illumos implementation of
> vmstat makes available [1]---"more fine-grained" meaning both "per process
> statistics" and, eg, "more information about individual faults".
>
> Given the reservations you note/confirm (potential onerous overhead of SDT
> probes in VM code, lack of clear mapping between Illumos probes and FreeBSD
> codebase, scalability of arg1 especially wrt SMP systems, complexity of
> per-domain NUMA stats) I propose the following as a first step:
>
> Implement probes that fire whenever values in the "page" category of
> vmstat(1) output change; that is: a page fault occurs (flt), a page is
> reactivated (re), a page is paged in (pi), a page is paged out (po), a
> page is freed (fr), a page is scanned by the page daemon (sr).
>
> I am unfamiliar with the codebase, but it seems likely to me that all of those
> use counter(9), and so we would be able to correctly populate arg1.

Most of them use counter(9), yes.  "sr" is a bit more complicated.
Basically, there is a counter in each page queue
(PQ_{ACTIVE,INACTIVE,LAUNDRY} times the number of NUMA domains) which is
updated once per "batch" of scanned pages.  The global "sr" value is
computed on demand by summing the per-pagequeue counters.

> This would be a very modest amount of work (at least relative to transferring
> the entire vminfo provider as it exists in Illumos) and give a starting point
> for measuring SDT overhead.

Yep, that sounds perfectly reasonable.

> Once those proposed probes are in place, we can decide whether to implement
> other probes from the Illumos set, add new probes that we determine useful,
> optimize the SDT implementation, address SMP or NUMA considerations, etc.
>
> Thoughts?

This makes sense to me.  The other thing we might consider is whether
it's worth including additional arguments (e.g., the physical vm_page_t)
in some cases.  That could always be added later though.

> pete
>
> [1] https://illumos.org/books/dtrace/chp-vminfo.html#chp-vminfo-3
>
> On Fri, Jun 03, 2022 at 03:47:31PM -0400, Mark Johnston wrote:
> > On Thu, Jun 02, 2022 at 01:08:27PM -0400, Peter Johnson wrote:
> > > Hi there --
> > >
> > > I would find the probes in Illumos' vminfo provider [1] really handy to have
> > > in FreeBSD and I'm happy to do the work to make it happen.  The only
> > > FreeBSD-related mention of the vminfo provider I can find is an old mailing
> > > list post [2] that I interpret to mean that the existing fbt probes aren't a
> > > meaningful alternative (not to mention that using fbt probes effectively
> > > requires more understanding of the source code than is perhaps desirable given
> > > DTrace's intended purpose/audience).
> > >
> > > My first question is: would such an addition be welcome?  I can make a more
> > > detailed case for its inclusion if that would be helpful/persuasive.
> >
> > I think it'd be welcome.  My major reservation is that SDT probes have
> > non-zero overhead even when disabled, especially on FreeBSD as currently
> > implemented.  The vminfo provider effectively adds a probe to various VM
> > counter increments, which can occur very very frequently in some
> > workloads, so I think we'd also want to
> > 1) try to measure that overhead, perhaps using some micro-benchmarks,
> > 2) possibly use the results to help motivate some long-overdue
> >    improvements to the SDT implementation.
> > I'd be interested in helping with both of these.
> >
> > It'd be helpful to see an example or two demonstrating how the vminfo
> > provider would be useful in diagnosing a particular problem.
> >
> > > If it is welcome, my plan would be to get very well-acquainted with FreeBSD's
> > > VM subsystem, identify where each of the vminfo probes described in the
> > > Illumos documentation should go, and then develop a patch to add those probes,
> > > seeking feedback from both freebsd-dtrace folks and whichever group has
> > > dominion over the VM stuff.
> > >
> > > My second question is: does this sound like a reasonable plan?  It is,
> > > admittedly, almost uselessly high level, but I expect I will need more than a
> > > little familiarity with the codebase before I can get more specific.
> >
> > Looking through the provider documentation, I suspect it'll be difficult
> > to implement some of the probes on FreeBSD, as you note below.  For
> > instance, I'm not sure that execfree can be implemented at all; FreeBSD
> > doesn't have any (cheap) way to determine whether a given physical page
> > belongs to an executable image.  At least, I can't think of one.
> >
> > A second issue is in the description of "arg1" for vminfo probes.  In
> > FreeBSD, frequently-updated counters are implemented using counter(9),
> > which provides per-CPU counters.  To get the global value of such a
> > counter, one must iterate over all per-CPU elements, summing them up.
> > That's quite expensive and wasteful if you're doing it every time a
> > vminfo probe fires.  I'm not sure how best to deal with that problem.
> >
> > Yet another consideration is how one might expose per-NUMA domain
> > counters.  We could simply ignore that consideration and just provide
> > global values, but per-domain info can be very useful.
> >
> > FreeBSD's VM system has a number of counters, exposed in various
> > subtrees of the "vm" sysctl node.  One might start by looking at the
> > existing counters to see how closely they match vminfo probes, or simply
> > define FreeBSD's vminfo provider in terms of the existing counters,
> > possibly adding new ones.
> >
> > > Given the mailing list post I mentioned above, it seems possible that some of
> > > the vminfo probes described in the Illumos documentation don't make sense in
> > > the context of FreeBSD (eg, if FreeBSD doesn't have a distinct paging daemon,
> > > then the pgrrun, rev, and scan probes aren't suited for transfer).  On the
> > > other hand, there may be aspects on the FreeBSD side which would be beneficial
> > > to monitor, but for which Illumos does not define probes.
> >
> > I agree.  FreeBSD does have a paging daemon, implemented in vm_pageout.c.
> >
> > > Therefore, my third question is: how important is it for a vminfo provider
> > > implementation in FreeBSD to hew closely to the Illumos implementation?  Would
> > > it be acceptable to not transfer some probes that don't make sense and add
> > > some new probes that do?  Documentation is obviously vital for any deviations,
> > > and I will make darn sure to make it a central part of the work.
> >
> > Having ported the ip/tcp/udp providers based on illumos documentation,
> > and having gone through some effort to make them compatible, I'm fairly
> > skeptical that it's important to maintain compatibility.  Most
> > non-trivial D scripts that I've seen and written which use these
> > providers will also make use of FBT probes here and there, so some
> > porting work is needed regardless.  Based on that, and on the
> > observations above, compatibility shouldn't be a priority IMHO.
> >
> > > Any and all feedback is most appreciated.
> > >
> > > Thanks.
> > >
> > > pete
> > >
> > > [1] https://illumos.org/books/dtrace/chp-vminfo.html
> > > [2] https://lists.freebsd.org/pipermail/freebsd-dtrace/2014-April/000209.html