Re: A new boot-time trace framework

From: Mitchell Horne <mhorne_at_freebsd.org>
Date: Mon, 07 Feb 2022 16:40:39 UTC

On 11/10/21 13:39, Bjoern A. Zeeb wrote:
> On 10 Nov 2021, at 16:26, Mitchell Horne wrote:
> 
>> Unlike TSLOG, I intend for this work to be compiled in to the kernel 
>> by default, but disabled behind a tunable (kern.boottrace.enabled). 
>> The cost of doing so should be minimal, only a couple of syscalls 
>> added to init(8) at most.
> 

Hi, apologies for my delayed reply here.

> I think if you really want to have this on by default (whether that make 
> sense or not for the majority of people) I’d at least avoid the function 
> call and reduce it to a branch which is super-easy to do.
> 

Good suggestion; the latest version includes this change.
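
For illustration, the pattern suggested above might look roughly like the
sketch below. Every name in it (trace_enabled, trace_record, TRACE_EVENT)
is hypothetical and chosen only for this example; it is not the actual
boottrace code. The point is that when tracing is disabled, each call
site costs a single branch rather than a function call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical names throughout; not the actual boottrace code. */
static bool trace_enabled = false;	/* stands in for the tunable */
static unsigned trace_calls = 0;	/* counts slow-path entries */

/* Slow path: only reached when tracing is enabled. */
static void
trace_record(const char *name)
{
	assert(name != NULL);
	trace_calls++;
	printf("trace: %s\n", name);
}

/* Fast path: when disabled, each call site costs only this branch. */
#define TRACE_EVENT(name) do {			\
	if (trace_enabled)			\
		trace_record(name);		\
} while (0)
```

With trace_enabled left false, TRACE_EVENT("foo") never enters
trace_record(); setting the flag to true makes the same call site record
the event.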

> My honest feeling is that another of the at least 3 other tracing 
> mechanisms existing these days be better extended and improved rather 
> than another one added;  we were always joking about 3 firewalls but if 
> we keep going this path we can soon start joking about 9 tracing 
> mechanisms and that will be a major mess for sysadmins.  I can see from 
> when this work was coming and back then it might have made sense this 
> way; but more than a decade has passed..
> 

Thanks for your input. In general, I agree with you: it is preferable to 
extend existing mechanisms rather than add something new with duplicated 
functionality, and I tend to look for this option first. However, I 
feel that a one-size-fits-all tracing framework is not possible, as what 
data is captured and how it is presented depend heavily on the 
meaning/nature of that data. Put differently, we will always require 
more than one tracing mechanism for different types of tracing tasks.

In reviewing the existing tracing options, I did not find building 
this functionality on top of them to be any easier or more desirable.

TSLOG covers much of the same area but with a different notion of 
tracing (i.e. recursive event tracing based around function entry/exit 
as opposed to one-shot events). The intended consumer of its data is a 
set of Python scripts, so it is presented much differently than the 
human-readable log of events that boottrace produces. KTR has a KTR_INIT 
class that appears completely unused and comes close to realizing the 
same purpose, but would still require the addition of a whole new 
interface for creating new KTR events from userspace. Rather than bend 
these existing tools to meet these needs, it seems better to me to leave 
them to do what they are good at, and instead add something new that is 
purpose-built and equally limited in scope (for which the work is 
already complete).

> /bz

On another note, it was misleading of me to call boottrace a 
'framework', as this implies a much wider scope than what it achieves. 
More accurate would be to call it a facility.

I intend to move forward and commit this work later this week if there 
are no lingering comments.

Cheers,
Mitchell