Re: Dtrace, Rust, and LLVM13

From: Mark Johnston <markj_at_freebsd.org>
Date: Fri, 03 Jun 2022 17:09:40 UTC
On Thu, Jun 02, 2022 at 03:27:48PM -0600, Alan Somers wrote:
> On Thu, Jun 2, 2022 at 2:55 PM Mark Johnston <markj@freebsd.org> wrote:
> >
> > On Wed, Jun 01, 2022 at 09:15:28PM -0600, Alan Somers wrote:
> > > The best way to profile a Rust program on FreeBSD is with Brendan
> > > Gregg's Flamegraph[1].  This is based on dtrace's ustack() function.
> > > It used to work great.  However, Rust v1.56.0 is based on LLVM13[2]
> > > and now dtrace can't print user stacks anymore.  For example,
> > >
> > > With Rust 1.55.0
> > >               libc.so.7`__je_malloc_mutex_prefork+0x124
> > >               libc.so.7`__je_arena_prefork7+0x73
> > >               libc.so.7`_malloc_prefork+0x15b
> > >               libthr.so.3`0x392e4a8c4686
> > >               libthr.so.3`_fork+0x18
> > >               test-dad15ed382b075cf`nix::unistd::fork::h358225d652a86eab+0xe
> > >               test-dad15ed382b075cf`test::test_unistd::test_fork_and_waitpid::hb93c7cdf2b79d680+0x36
> > >               test-dad15ed382b075cf`test::test_unistd::test_fork_and_waitpid::_$u7b$$u7b$closure$u7d$$u7d$::h329a121974ff9291+0x11
> > >               test-dad15ed382b075cf`core::ops::function::FnOnce::call_once::h2261827bcba63036+0x11
> > >               test-dad15ed382b075cf`test::__rust_begin_short_backtrace::hefb7644d11da2ff9+0xa
> > >               test-dad15ed382b075cf`test::run_test::run_test_inner::_$u7b$$u7b$closure$u7d$$u7d$::hdaa0fb71aac8d97e+0x2f3
> > >               test-dad15ed382b075cf`std::sys_common::backtrace::__rust_begin_short_backtrace::h8bcc057a546c1087+0xce
> > >               test-dad15ed382b075cf`core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hf7d978d08be459d0+0x6a
> > >               test-dad15ed382b075cf`std::sys::unix::thread::Thread::new::thread_start::h6b52ca0eca213387+0x2b
> >
> > What are the identifiers at the end of each function?
> 
> You mean the hex number that comes before the offset?  I think the
> Rust compiler uses those to disambiguate closures, which would
> otherwise have the same name.  However, not all of these functions are
> closures.  Maybe Rust always generates that identifier anyway?

Ok.  Was just curious to know if there's something there that we should
be demangling, but it seems not.

> > >               libthr.so.3`0x392e4a8c3a7a
> > >
> > > With Rust 1.56.0
> > >               libc.so.7`__je_malloc_mutex_prefork+0x124
> > >               libc.so.7`__je_arena_prefork7+0x73
> > >               libc.so.7`_malloc_prefork+0x15b
> > >               libthr.so.3`0x1106cebc6686
> > >               libthr.so.3`_fork+0x18
> > >               test-b377ad62cc9e0624`nix::unistd::fork::hbf1ed55b658aa870+0xa
> > >               0x8
> > >               0xcccccccccccccccc
> > >
> > > See?  dtrace still prints the C part of the stack, but it only prints
> > > one or sometimes two frames of the Rust stack.
> >
> > ustack() unwinds the stack using the frame pointer saved in each stack
> > frame.  It'll fail to unwind code compiled without a frame pointer,
> > e.g., if -fomit-frame-pointer is specified to a C/C++ compiler.  For
> > this and similar reasons, we compile the base system with
> > -fno-omit-frame-pointer by default.
> >
> > > I'm not a compiler guy, so I don't know how to fix it.  I don't even
> > > know if the problem lies in Rust or dtrace.  Would any of you smart
> > > people be able to help here?  This is a pretty important feature for
> > > Rust development.
> >
> > I'd guess that Rust started omitting use of the frame pointer in
> > generated code.  This commit seems to indicate that that's the case:
> > https://github.com/rust-lang/rust/pull/48786/commits/5b800c231f45fcd823a3e958bb942cd620ceb3e0
> > Though, it's rather old.  I'm not sure why the problem appeared only
> > now.  So maybe the problem is elsewhere, but the commit log also
> > mentions a -Cforce-frame-pointers flag that you could try.
> >
> > While it's possible to unwind the stack without using a frame pointer, a
> > frame pointer makes doing so dead simple.  ustack() does the unwinding
> > in the kernel, in DTrace probe context, and text addresses are resolved
> > to symbol names in userspace.  So it's generally desirable to keep the
> > implementation simple.
> 
> That works!  Setting RUSTFLAGS="-C force-frame-pointers" allows dtrace
> to unwind the stacks in the resulting binaries.  However, it still
> doesn't work when I build a Rust shared library.  Any idea why that
> might be?

Doesn't work as in, the trace stops early like in your earlier example?
Or names don't get resolved as you'd expect?  I don't have any ideas
offhand, you could try looking at disassembly of the generated code to
see if a frame pointer is indeed maintained (generally I'd expect there
to be a "push %rbp" as the first instruction of a function symbol, and
"pop %rbp" before returning).  For instance:
https://rust.godbolt.org/z/WM7xP83WK