Problem with signal 0 being delivered to SIGUSR1 handler

David Xu davidxu at freebsd.org
Fri Nov 22 03:55:55 UTC 2013


On 2013/11/22 05:15, Konstantin Belousov wrote:
> On Thu, Nov 21, 2013 at 02:39:02PM +0200, Vitaly Magerya wrote:
>> Hi, folks. I'm investigating a test case failure that devel/boehm-gc
>> has on recent FreeBSD releases. The problem is that a signal
>> handler registered for SIGUSR1 is sometimes called with signum=0,
>> which should not be possible under any conditions.
>>
>> Here's a simple test case that demonstrates this behavior:
>>
>> /* Compile with 'c99 -o example example.c -pthread'
>>   */
>> #include <pthread.h>
>> #include <signal.h>
>> #include <stdio.h>
>> #include <stdlib.h>
>>
>> void signal_handler(int signum, siginfo_t *si, void *context) {
>>      if (signum != SIGUSR1) {
>>          printf("bad signal, signum=%d\n", signum);
>>          exit(1);
>>      }
>> }
>>
>> void *thread_func(void *arg) {
>>      return arg;
>> }
>>
>> int main(void) {
>>      struct sigaction sa = { 0 };
>>      sa.sa_flags = SA_SIGINFO;
>>      sa.sa_sigaction = signal_handler;
>>      if (sigfillset(&sa.sa_mask) != 0) abort();
>>      if (sigaction(SIGUSR1, &sa, NULL) != 0) abort();
>>      for (int i = 0; i < 10000; i++) {
>>          pthread_t t;
>>          pthread_create(&t, NULL, thread_func, NULL);
>>          pthread_kill(t, SIGUSR1);
> Side note.  pthread_kill(3) call behaviour is undefined if pthread_create(3)
> in the line before failed.
>
>>      }
>>      return 0;
>> }
>>
>> Under FreeBSD 9.2-RELEASE amd64 I pretty consistently get
>> "signum=0" from this program, but you may need to run it a few
>> times or increase the number of iterations to see the same.
>>
>> Interestingly enough, I don't see this behavior under 9.0-RELEASE.
>>
>> So, any ideas what the problem here is?
>
> It happens when libthr deferred signal handling path is taken for signal
> delivery and for some reason the code inside the deferred path called
> into rtld for symbol binding. Than, rtld lock is locked, some code in
> rtld is executed, and rtld lock is unlocked. Unlock causes _thr_ast()
> run, which results in the nested check_deferred_signal() execution.
> The check_deferred_signal() clearks si_signo, so on return the same
> signal is delivered one more time, but is advertized as signo zero.
>
> The _thr_rtld_init() approach of doing dummy calls does not really work,
> since it is not practically possible to enumerate the symbols needed
> during signal delivery.
>
> My first attempt to fix this was to increment curthread->critical_count
> around the calls to check_* functions in the _thr_ast(), but it causes
> reverse problem of losing _thr_ast() runs on unlock.
>
> I ended up with the flag to indicate that deferred delivery is running,
> so check_deferred_signal() should avoid doing anything. A delicate
> moment is that user signal handler is allowed to modify the passed
> machine context to result the return from the signal handler to cause
> arbitrary jump, or just do longjmp(). For this case, I also clear the
> flag in thr_sighandler(), since kernel signal delivery means that nested
> delivery code should not run right now.
>
> Please try this.
>
> diff --git a/lib/libthr/thread/thr_private.h b/lib/libthr/thread/thr_private.h
> index 83a02b5..c6651cd 100644
> --- a/lib/libthr/thread/thr_private.h
> +++ b/lib/libthr/thread/thr_private.h
> @@ -433,6 +433,9 @@ struct pthread {
>   	/* the sigaction should be used for deferred signal. */
>   	struct sigaction	deferred_sigact;
>
> +	/* deferred signal delivery is performed, do not reenter. */
> +	int			deferred_run;
> +
>   	/* Force new thread to exit. */
>   	int			force_exit;
>
> diff --git a/lib/libthr/thread/thr_sig.c b/lib/libthr/thread/thr_sig.c
> index 415ddb0..57c9406 100644
> --- a/lib/libthr/thread/thr_sig.c
> +++ b/lib/libthr/thread/thr_sig.c
> @@ -162,6 +162,7 @@ thr_sighandler(int sig, siginfo_t *info, void *_ucp)
>   	act = _thr_sigact[sig-1].sigact;
>   	_thr_rwl_unlock(&_thr_sigact[sig-1].lock);
>   	errno = err;
> +	curthread->deferred_run = 0;
>
>   	/*
>   	 * if a thread is in critical region, for example it holds low level locks,
> @@ -320,14 +321,18 @@ check_deferred_signal(struct pthread *curthread)
>   	siginfo_t info;
>   	int uc_len;
>
> -	if (__predict_true(curthread->deferred_siginfo.si_signo == 0))
> +	if (__predict_true(curthread->deferred_siginfo.si_signo == 0 ||
> +	    curthread->deferred_run))
>   		return;
>
> +	curthread->deferred_run = 1;
>   	uc_len = __getcontextx_size();
>   	uc = alloca(uc_len);
>   	getcontext(uc);
> -	if (curthread->deferred_siginfo.si_signo == 0)
> +	if (curthread->deferred_siginfo.si_signo == 0) {
> +		curthread->deferred_run = 0;
>   		return;
> +	}
>   	__fillcontextx2((char *)uc);
>   	act = curthread->deferred_sigact;
>   	uc->uc_sigmask = curthread->deferred_sigmask;
>

The patch looks fine to me.



More information about the freebsd-threads mailing list