Re: git: ddf0ed09bd8f - main - sdt: Implement SDT probes using hot-patching

From: Ryan Libby <rlibby_at_gmail.com>
Date: Thu, 27 Jun 2024 02:27:53 UTC
On Mon, Jun 24, 2024 at 11:51 AM Mark Johnston <markj@freebsd.org> wrote:
>
> On Mon, Jun 24, 2024 at 09:27:55AM -0700, Ryan Libby wrote:
> > On Mon, Jun 24, 2024 at 9:06 AM Mark Johnston <markj@freebsd.org> wrote:
> > >
> > > On Mon, Jun 24, 2024 at 08:36:55AM -0700, Ryan Libby wrote:
> > > > On Wed, Jun 19, 2024 at 1:58 PM Mark Johnston <markj@freebsd.org> wrote:
> > > > >
> > > > > The branch main has been updated by markj:
> > > > >
> > > > > URL: https://cgit.FreeBSD.org/src/commit/?id=ddf0ed09bd8f83677407db36828aca2c10f419c9
> > > > >
> > > > > commit ddf0ed09bd8f83677407db36828aca2c10f419c9
> > > > > Author:     Mark Johnston <markj@FreeBSD.org>
> > > > > AuthorDate: 2024-06-19 20:57:09 +0000
> > > > > Commit:     Mark Johnston <markj@FreeBSD.org>
> > > > > CommitDate: 2024-06-19 20:57:41 +0000
> > > > >
> > > > >     sdt: Implement SDT probes using hot-patching
> > > > >
> > > > >     The idea here is to avoid a memory access and conditional branch per
> > > > >     probe site.  Instead, the probe is represented by an "unreachable"
> > > > >     unconditional function call.  asm goto is used to store the address of
> > > > >     the probe site (represented by a no-op sled) and the address of the
> > > > >     function call into a tracepoint record.  Each SDT probe carries a list
> > > > >     of tracepoints.
> > > >
> > > > Questions out of curiosity and maybe ignorance:
> > > >
> > > > How does this work with relocations?  Something must be adjusting these
> > > > addresses?
> > >
> > > The compiler handles this as part of the implementation of asm goto:
> > > the inline assembly can reference jump targets with "%l<index>" and
> > > they're specified as operands to the asm goto statement.  In the kernel
> > > these references are resolved statically, and kernel modules will
> > > contain relocations for the sdt_tracepoint_set section.
> > >
> > > > > +/*
> > > > > + * Work around an apparent clang bug or limitation which prevents the use of the
> > > > > + * "i" (immediate) constraint with the probe structure.
> > > > > + */
> > > > > +#define        _SDT_ASM_PROBE_CONSTRAINT       "Ws"
> > > > > +#define        _SDT_ASM_PROBE_OPERAND          "p"
> > > >
> > > > Is it because i386 kmods are built with -fPIC?
> > >
> > > I suspect that that's related, yeah.  The compiler might be assuming
> > > that some indirection is needed to compute the target address, but in
> > > this case it's an address in the same function and presumably can safely
> > > be assumed to be an immediate.
> > >
> >
> > That makes sense for the "%l1", does it also apply to the "%c0"?  Or
> > does use of "%c" for the probe pointer require non-PIC?  As in, don't
> > the _probes_ get relocated, and don't we need to patch the pointers to
> > the probes?
>
> When I use '%c0' to refer to the input operand, the intent is to insert
> the symbol name, not the address of the probe structure as computed by
> the compiler.  In an earlier iteration, there was no input operand and I
> just had something like
>
> __asm(
>   ...
>   ".quad " __STRING(_SDT_PROBE_NAME(...)) "\n"
>   ...);
>
> But this doesn't work when the probe symbol is local but has global
> linkage (i.e., it was defined with "static"), since we don't know what
> the symbol name is at compile time.  Hence the indirection, and I needed
> "c" to get clang to do what I want.  The assembler encounters SDT probe
> symbol names and emits relocations accordingly.  Maybe there's a better
> way to do what I want?  It seems that this doesn't work at all with gcc
> when -fPIC is defined.
>

Thanks for the exposition and background.

I think I'm still confused.  I haven't tried spinning up an i386 machine
yet, which is probably the more reasonable next step, but I did page
through some disassembly.

Comparing disassembly of dtrace_test.ko with llvm-objdump -rD:

clang amd64:
0000000000000000 <set_sdt_tracepoint_set>:
       0: 00 00                         addb    %al, (%rax)
                0000000000000000:  R_X86_64_64  sdt_test___sdttest
       2: 00 00                         addb    %al, (%rax)
       4: 00 00                         addb    %al, (%rax)
       6: 00 00                         addb    %al, (%rax)
       8: 00 00                         addb    %al, (%rax)
                0000000000000008:  R_X86_64_64  .text+0x2f
       a: 00 00                         addb    %al, (%rax)
       c: 00 00                         addb    %al, (%rax)
       e: 00 00                         addb    %al, (%rax)
      10: 00 00                         addb    %al, (%rax)
                0000000000000010:  R_X86_64_64  .text+0x3b
                ...
      1e: 00 00                         addb    %al, (%rax)

Okay, that looks like relocations, which is what we want.

clang i386:
000028a0 <__start_set_sdt_tracepoint_set>:
    28a0: c4 28                         lesl    (%eax), %ebp
    28a2: 00 00                         addb    %al, (%eax)
    28a4: 21 18                         andl    %ebx, (%eax)
    28a6: 00 00                         addb    %al, (%eax)
    28a8: 2d 18 00 00 00                subl    $0x18, %eax
    28ad: 00 00                         addb    %al, (%eax)
    28af: 00                            <unknown>
...
000028c4 <sdt_test___sdttest>:
    28c4: 3c 00                         cmpb    $0x0, %al
    28c6: 00 00                         addb    %al, (%eax)
    28c8: b0 28                         movb    $0x28, %al
                ...

That looks like plain data, with no relocations against it?  At 0x28a0
we seem to have a pointer to 0x28c4, the address of the probe within the
.ko.  But how do we know to fix this up at load time?  In the readelf -r
output I don't see any reference to sdt_test___sdttest, as I do for the
amd64 build.

Again, I might be confusing things or using the tools wrong; this isn't
really my wheelhouse.  I'll try to get an i386 VM up soon to improve my
understanding.

Ryan