Re: schedgraph.d experience, per-CPU buffers, pipes

From: Ryan Stone <rysto32_at_gmail.com>
Date: Sat, 01 Jan 2022 18:44:04 UTC
I've definitely experienced the issue of different buffers rolling
over faster than others and producing confusing schedgraph data.  I'm
away from home this week, but I believe that I have a script that
tries to chop off the schedgraph data at the point where the most
recent CPU to roll over has no more data.  If I think of it I'll try
to pass it along.
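In the meantime, here is a minimal sketch of that kind of trimming (not
the actual script; the event format, with a CPU id and a timestamp per
event, is an assumption):

```python
# Trim per-CPU ring-buffer trace data to the window covered by every CPU.
# Hypothetical event format: (cpu_id, timestamp, payload) tuples; this is
# an illustration, not schedgraph's actual input format.

def trim_trace(events):
    """Drop events older than the latest 'first event' across all CPUs.

    Each per-CPU ring buffer wraps independently, so its oldest surviving
    event marks where that CPU's coverage begins.  The interval usable for
    schedgraph starts at the maximum of those per-CPU start times; anything
    earlier has gaps ("missing" processors).
    """
    first_seen = {}
    for cpu, ts, _ in events:
        if cpu not in first_seen or ts < first_seen[cpu]:
            first_seen[cpu] = ts
    cutoff = max(first_seen.values())
    return [e for e in events if e[1] >= cutoff]
```

With that, feeding only the trimmed events to schedgraph avoids the
confusing early section where some CPUs appear idle simply because their
buffers wrapped further.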

Another issue that I remember encountering is that there is a limit on
the total amount of space that you can allocate to dtrace buffers, and
it does not scale with the number of CPUs in the system, so the more
CPUs you have, the less buffer you can allocate per CPU.  I seem to
recall the limit being very small compared to the amount of memory on
a modern large-core-count system.
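For context, the buffer size is requested per CPU in the script itself;
an illustrative D script header might look like this (the size and probe
actions here are made-up examples, not the actual schedgraph script):

```d
/* Illustrative D script header; bufsize value is an arbitrary example. */
#pragma D option bufpolicy=ring   /* per-CPU circular buffers */
#pragma D option bufsize=16m      /* requested size per CPU, subject to
                                     the global cap discussed above */

sched:::on-cpu,
sched:::off-cpu
{
	/* record enough for post-processing: time, CPU, and thread */
	printf("%d %d %d\n", timestamp, cpu, pid);
}
```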

On Fri, Dec 24, 2021 at 8:08 AM Andriy Gapon <avg@freebsd.org> wrote:
>
>
> I would like to share some experience, or maybe rather a warning, about using
> DTrace for tracing scheduling events.  Unlike KTR, which has a global circular
> buffer, DTrace with bufpolicy=ring uses per-CPU circular buffers.  So, if there
> is an asymmetry in processor load, the buffers will fill and wrap around at
> different speeds.  In the end, they might hold approximately equal numbers of
> events, but those may cover very different time intervals.  So, some additional
> post-processing is required to find the latest event among the first events of
> each per-CPU buffer.  Any trace data from before that point would have
> information gaps ("missing" processors) and would be very confusing.
>
> Also, I noticed that processes passing a lot of data through pipes produce a
> lot of scheduling events, as they seem to get blocked and unblocked every few
> microseconds (on a modern, performant system with the default pipe sizing
> configuration).  That contributes to a quick wrap-around of the circular
> buffers.
>
> --
> Andriy Gapon
>