[Bug 273688] sysutils/pstack: does not work with Valgrind

From: <bugzilla-noreply_at_freebsd.org>
Date: Sun, 24 Sep 2023 10:26:21 UTC

--- Comment #2 from Paul Floyd <pjfloyd@wanadoo.fr> ---
Some more analysis

In the first window:

./vg-in-place sleep 100000

That's my dev build of Valgrind; plain 'valgrind' from the devel/valgrind-devel
pkg should be fine as well.

In the second window, the GDB test:

(gdb) attach 48907
Attaching to process 48907
Reading symbols from
[Switching to LWP 104670 of process 48907]
vgModuleLocal_do_syscall_for_client_WRK () at
144        setc  0(%rsp)         /* stash returned carry flag */
(gdb) bt
#0  vgModuleLocal_do_syscall_for_client_WRK () at
#1  0x000000003819f27a in do_syscall_for_client (syscallno=240,
tst=0x1002024f10, syscall_mask=0x1002ca9e20) at m_syswrap/syswrap-main.c:368
#2  vgPlain_client_syscall (tid=tid@entry=1, trc=trc@entry=73) at
#3  0x000000003819b150 in handle_syscall (tid=tid@entry=1, trc=trc@entry=73) at
#4  0x0000000038199223 in vgPlain_scheduler (tid=tid@entry=1) at
#5  0x00000000381ab33c in thread_wrapper (tidW=1) at
#6  run_a_thread_NORETURN (tidW=1) at m_syswrap/syswrap-freebsd.c:166
#7  0x0000000000000000 in ?? ()

That's what I'd expect.

And lldb:
(lldb) attach 48907
This version of LLDB has no plugin for the language "assembler". Inspection of
frame variables will be limited.
Process 48907 stopped
* thread #1, name = 'memcheck-amd64-f', stop reason = signal SIGSTOP
    frame #0: 0x00000000381a03a6
memcheck-amd64-freebsd`vgModuleLocal_do_syscall_for_client_WRK at
   141        but hasn't been committed to RAX. */
   143     /* stack contents: 3 words for syscall above, plus our prologue */
-> 144     setc  0(%rsp)         /* stash returned carry flag */
   146     movq  -16(%rbp), %r11 /* r11 = VexGuestAMD64State * */
   147     movq  %rax, OFFSET_amd64_RAX(%r11)    /* save back to RAX */
Executable module set to
Architecture set to: x86_64-unknown-freebsd13.2.
(lldb) bt
* thread #1, name = 'memcheck-amd64-f', stop reason = signal SIGSTOP
  * frame #0: 0x00000000381a03a6
memcheck-amd64-freebsd`vgModuleLocal_do_syscall_for_client_WRK at
    frame #1: 0x000000003819f27a memcheck-amd64-freebsd`vgPlain_client_syscall
[inlined] do_syscall_for_client(syscallno=240, tst=0x0000001002024f10,
syscall_mask=0x0000001002ca9e20) at syswrap-main.c:368:10
    frame #2: 0x000000003819f232
memcheck-amd64-freebsd`vgPlain_client_syscall(tid=1, trc=73) at
    frame #3: 0x000000003819b150 memcheck-amd64-freebsd`handle_syscall(tid=1,
trc=73) at scheduler.c:1206:4
    frame #4: 0x0000000038199223
memcheck-amd64-freebsd`vgPlain_scheduler(tid=1) at scheduler.c:1552:3
    frame #5: 0x00000000381ab33c memcheck-amd64-freebsd`run_a_thread_NORETURN
[inlined] thread_wrapper(tidW=1) at syswrap-freebsd.c:112:10
    frame #6: 0x00000000381ab2c6
memcheck-amd64-freebsd`run_a_thread_NORETURN(tidW=1) at

Again, that's OK.

I've downloaded the pstack source from GitHub and built it.

In gdb, looking at elfFindSymbolByAddress, I see that the address that pstack is
using is the same as the address I see when attaching gdb, namely 0x381a03a6.

There is no .dynsym so elfFindSectionByName finds .symtab.

symStrings looks OK to me. The first entry is nil, and after that there is

(gdb) x /s obj->fileData + shdrs[symSection->sh_link]->sh_offset+1
0x8027e5829:    "mc_leakcheck.c"

That matches what I see with objdump -t:

paulf> objdump -t
/usr/home/paulf/scratch/valgrind/memcheck/memcheck-amd64-freebsd | less         

/usr/home/paulf/scratch/valgrind/memcheck/memcheck-amd64-freebsd:     file
format elf64-x86-64-freebsd

0000000000000000 l    df *ABS*  0000000000000000 mc_leakcheck.c
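
For reference, the gdb expression above reads the string at byte offset 1 of
the string table because an ELF string table always starts with a NUL byte
(the "nil" first entry), and symbols refer to their names by byte offset. A
minimal sketch of that layout follows. It is not pstack code: sk_strtab,
sk_strtab_name and the second name in the table are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Made-up string table in ELF layout: byte 0 is NUL (the empty name),
 * then NUL-terminated names packed back to back.
 * "mc_leakcheck.c" starts at offset 1, "sk_example" at offset 16. */
static const char sk_strtab[] = "\0mc_leakcheck.c\0sk_example";

/* Return the name stored at byte offset 'off', or NULL if the offset is
 * outside the table or the name is not NUL-terminated inside it. */
static const char *
sk_strtab_name(const char *tab, size_t tab_size, size_t off)
{
    if (off >= tab_size)
        return NULL;
    if (memchr(tab + off, '\0', tab_size - off) == NULL)
        return NULL;
    return tab + off;
}
```

With this table, offset 0 yields the empty string and offset 1 yields
"mc_leakcheck.c", which is the same pattern as in the gdb output.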

And the function where it is sleeping is

00000000381a034c g       .text  0000000000000000

I've done some more debugging and I've seen one error.

@@ -196,11 +198,23 @@ elfFindSymbolByAddress(struct ElfObject *obj, Elf_Addr
                    symSection->sh_offset + symSection->sh_size);

                for (; sym < endSym; sym++) {
-                       if ((type == STT_NOTYPE ||
+                       if ((ELF_ST_TYPE(sym->st_info) == STT_NOTYPE ||

elfFindSymbolByAddress is only ever called with type == STT_FUNC, so the old
test (type == STT_NOTYPE, i.e. STT_FUNC == STT_NOTYPE) is always false and any
function whose symbol type is STT_NOTYPE is never processed. I suppose
STT_NOTYPE also means a size of 0.

With the above change I get

  0x381a03a6 vgModuleLocal_do_syscall_for_client_WRK (3819f27a, 0, 1301, 0,
3812ddf8, 0) + 5a

but only the one line.
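
To make the effect of the one-line change concrete, here is a minimal
standalone sketch of the two forms of the type test. It is not pstack's code:
the sk_* names, the cut-down symbol record, the nearest-symbol rule for
zero-sized symbols, and the sk_example_func entry are all assumptions made for
illustration. Only the two addresses, the symbol names and the STT_* values
(which come from the ELF specification) are taken from the output above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ELF specification values, defined locally so that the sketch builds
 * without any platform ELF header. */
#define SK_STT_NOTYPE 0
#define SK_STT_FUNC   2
#define SK_STT_FILE   4
#define SK_ST_TYPE(info) ((info) & 0xf)

/* Hypothetical, cut-down symbol record (not Elf_Sym). */
struct sk_sym {
    const char   *name;
    uint64_t      value;  /* start address */
    uint64_t      size;   /* 0 for the assembler label */
    unsigned char info;   /* low nibble holds the symbol type */
};

/* Old test: compares the REQUESTED type with STT_NOTYPE. */
static int sk_type_ok_old(const struct sk_sym *s, int type)
{
    return type == SK_STT_NOTYPE || SK_ST_TYPE(s->info) == type;
}

/* Changed test: compares the SYMBOL's type with STT_NOTYPE. */
static int sk_type_ok_new(const struct sk_sym *s, int type)
{
    return SK_ST_TYPE(s->info) == SK_STT_NOTYPE ||
           SK_ST_TYPE(s->info) == type;
}

/* Assumed lookup rule: highest symbol starting at or below addr; sized
 * symbols must also contain addr, zero-sized ones match by proximity. */
static const struct sk_sym *
sk_find(const struct sk_sym *syms, size_t n, uint64_t addr, int type,
        int (*type_ok)(const struct sk_sym *, int))
{
    const struct sk_sym *best = NULL;

    assert(syms != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        const struct sk_sym *s = &syms[i];

        if (!type_ok(s, type) || s->value > addr)
            continue;
        if (s->size != 0 && addr >= s->value + s->size)
            continue;
        if (best == NULL || s->value > best->value)
            best = s;
    }
    return best;
}

/* Demo table: a file symbol, the untyped global label, and one made-up
 * sized function that does not contain the address. */
static const struct sk_sym sk_demo_syms[] = {
    { "mc_leakcheck.c", 0x0, 0, SK_STT_FILE },
    { "vgModuleLocal_do_syscall_for_client_WRK", 0x381a034c, 0,
      (1 << 4) | SK_STT_NOTYPE },
    { "sk_example_func", 0x38199000, 0x100, (1 << 4) | SK_STT_FUNC },
};
```

Calling sk_find(sk_demo_syms, 3, 0x381a03a6, SK_STT_FUNC, sk_type_ok_old)
finds nothing, while the same call with sk_type_ok_new returns the
vgModuleLocal_do_syscall_for_client_WRK entry. The offset is then
0x381a03a6 - 0x381a034c = 0x5a, matching the "+ 5a" in the pstack output.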

I need to do some more debugging of elfFindSymbolByAddress() to see why it's
not getting the full callstack.
