valgrind on amd64 crashes when delivering signal for threaded application
Benjamin Kaduk
kaduk at MIT.EDU
Thu Apr 24 03:33:35 UTC 2014
On Wed, 23 Apr 2014, Mikolaj Golub wrote:
> I am observing an issue with valgrind on amd64 CURRENT or 10, when it
I cannot remember whether we changed the stack alignment on one or both of
i386 and amd64 when we switched to clang; I think we did, but am having
trouble finding it in the archives. Though, I think it would have been to
match what clang does by default on linux, which would not really help
explain the weird behavior from valgrind.
> I tracked it to r249423 (import of clang 3.3), which optimizes
> this statement in the signal handler wrapper from thr_sig.c:
>
> static void
> thr_sighandler(int sig, siginfo_t *info, void *_ucp)
> {
> ...
> struct sigaction act;
> ...
> act = _thr_sigact[sig-1].sigact;
>
> into a sequence of movups/movaps instructions:
>
> 0x000000000000dc2f <+79>: movups (%r14,%r15,1),%xmm0
> 0x000000000000dc34 <+84>: movups 0x10(%r14,%r15,1),%xmm1
> 0x000000000000dc3a <+90>: movaps %xmm1,-0x40(%rbp)
> 0x000000000000dc3e <+94>: movaps %xmm0,-0x50(%rbp)
>
> I have lost in valgrind signal handling details, but apparently the
> frame for thr_sighandler() is misaligned when running by valgrind and
> as a result the movaps operand (the destination of act local variable)
> is not aligned on a 16-byte boundary.
>
> The prblem may be workarounded either by compiling thr_sig.c without
> optimization or replacing the assignment by bcopy().
>
> Also, changing the alignment of the sigframe the valgrind pushes on
> the stack when delivering a signal to 8 bytes fixes the issue:
>
> --- coregrind/m_sigframe/sigframe-amd64-freebsd.c.orig 2014-04-23 22:39:45.000000000 +0300
> +++ coregrind/m_sigframe/sigframe-amd64-freebsd.c 2014-04-23 22:40:23.000000000 +0300
> @@ -250,7 +250,7 @@ static Addr build_sigframe(ThreadState *
> UWord err;
>
> rsp -= sizeof(*frame);
> - rsp = VG_ROUNDDN(rsp, 16);
> + rsp = VG_ROUNDDN(rsp, 16) - 8;
I would expect that the fact that this patch fixes the observed crash
means that valgrind has a bug when setting up the stack for the signal
handler. I had to work around an apparently similar bug in the built-in
lightweight thread implementation in net/openafs by forcing
-mstack-realign to be used for its compilation (because analyzing the
lightweight threads implementation when upstream is trying to switch to
pthreads is not worth the effort). I guess here the thing to try would be
compiling libthr with -mstack-realign, not that that is a reasonable thing
to do in head. Perhaps the valgrind upstream should be asked about the
details of the stack creation?
-Ben
> frame = (struct sigframe *)rsp;
>
> if (!extend(tst, rsp, sizeof(*frame)))
>
> Unfortunately, I have poor understanding of valgrind internals and
> what is going on exactly when it delivers a signal to the process, so
> failed to find a proper fix.
More information about the freebsd-hackers
mailing list