valgrind on amd64 crashes when delivering signal for threaded application

Thu Apr 24 03:33:35 UTC 2014

On Wed, 23 Apr 2014, Mikolaj Golub wrote:

> I am observing an issue with valgrind on amd64 CURRENT or 10, when it

I cannot remember whether we changed the stack alignment on one or both of 
i386 and amd64 when we switched to clang; I think we did, but am having 
trouble finding it in the archives.  Though, I think it would have been to 
match what clang does by default on linux, which would not really help 
explain the weird behavior from valgrind.

> I tracked it to r249423 (import of clang 3.3), which optimizes
> this statement in the signal handler wrapper from thr_sig.c:
>
>  static void
>  thr_sighandler(int sig, siginfo_t *info, void *_ucp)
>  {
>  	...
>  	struct sigaction act;
>  	...
>  	act = _thr_sigact[sig-1].sigact;
>
> into a sequence of movups/movaps instructions:
>
>   0x000000000000dc2f <+79>:    movups (%r14,%r15,1),%xmm0
>   0x000000000000dc34 <+84>:    movups 0x10(%r14,%r15,1),%xmm1
>   0x000000000000dc3a <+90>:    movaps %xmm1,-0x40(%rbp)
>   0x000000000000dc3e <+94>:    movaps %xmm0,-0x50(%rbp)
>
> I have lost in valgrind signal handling details, but apparently the
> frame for thr_sighandler() is misaligned when running by valgrind and
> as a result the movaps operand (the destination of act local variable)
> is not aligned on a 16-byte boundary.
>
> The prblem may be workarounded either by compiling thr_sig.c without
> optimization or replacing the assignment by bcopy().
>
> Also, changing the alignment of the sigframe the valgrind pushes on
> the stack when delivering a signal to 8 bytes fixes the issue:
>
>  --- coregrind/m_sigframe/sigframe-amd64-freebsd.c.orig  2014-04-23 22:39:45.000000000 +0300
>  +++ coregrind/m_sigframe/sigframe-amd64-freebsd.c       2014-04-23 22:40:23.000000000 +0300
>  @@ -250,7 +250,7 @@ static Addr build_sigframe(ThreadState *
>      UWord err;
>
>      rsp -= sizeof(*frame);
>  -   rsp = VG_ROUNDDN(rsp, 16);
>  +   rsp = VG_ROUNDDN(rsp, 16) - 8;

I would expect that the fact that this patch fixes the observed crash 
means that valgrind has a bug when setting up the stack for the signal 
handler.  I had to work around an apparently similar bug in the built-in 
lightweight thread implementation in net/openafs by forcing 
-mstack-realign to be used for its compilation (because analyzing the 
lightweight threads implementation when upstream is trying to switch to 
pthreads is not worth the effort).  I guess here the thing to try would be 
compiling libthr with -mstack-realign, not that that is a reasonable thing 
to do in head.  Perhaps the valgrind upstream should be asked about the 
details of the stack creation?

-Ben

>      frame = (struct sigframe *)rsp;
>
>      if (!extend(tst, rsp, sizeof(*frame)))
>
> Unfortunately, I have poor understanding of valgrind internals and
> what is going on exactly when it delivers a signal to the process, so
> failed to find a proper fix.