signal handler priority issue

Sean McNeil sean at mcneil.com
Fri Jun 11 06:26:06 GMT 2004


On Thu, 2004-06-10 at 23:06, Daniel Eischen wrote:
> On Thu, 10 Jun 2004, Sean McNeil wrote:
> 
> > On Thu, 2004-06-10 at 22:29, Daniel Eischen wrote:
> > > > > 
> > > > > It can't keep going if there is a possibility that it can
> > > > > send the same thread another SIGUSR2.
> > > > 
> > > > I don't follow. Sorry.
> > > 
> > > If the master thread does:
> > > 
> > > 	for (i = 0; i < 4; i++) {
> > > 		pthread_kill(slave, SIGUSR1);
> > > 		sem_wait(&slave_semaphore);
> > > 		pthread_kill(slave, SIGUSR2);
> > > 	}
> > > 
> > > You can see that there is a potential race condition where
> > > the slave thread gets SIGUSR1 and SIGUSR2 very close together.
> > > It is even possible to get them together in one sigsuspend()
> > > (if they are both unmasked in the suspend mask).
> > > 
> > > You could fix the race by blocking SIGUSR1 from within
> > > the signal handler (like I described in my last email).
> > 
> > I take it then that when a signal handler is invoked, its signal
> > isn't masked while the handler runs.  It isn't like a standard hardware
> > interrupt then.  I'm trying what you suggest and will post results.
> 
> Like I said before, it depends on the mask of the installed
> signal handler (sigact.sa_mask).  You should use sigaction()
> and not signal() to get the desired behavior.
> 
> Your other output looked strange.  I was expecting the
> "restart" count to start at 1, not 2.

That is my fault.  I didn't give enough output.  That count is how many
times the world has been stopped.  Here is all the output:

Stopping the world from 0x50d000
World stopped from 0x50d000
Pushing stacks from thread 0x50d000
Stack for thread 0x50d000 = [7fffffffdfa0,800000000000)
World starting
World started
About to start new thread from thread 0x50D000
Started thread 0x9D1400
Starting thread 0x9d1400
pid = 85636
sp = 0x7fffffeedf80
start_routine = 0x200db4960
Unable to locate tools.jar. Expected to find it in /usr/local/gcc-cvs/lib/tools.jar
Stopping the world from 0x50d000
Sending suspend signal to 0x9d1400
Suspending 0x9d1400
World stopped from 0x50d000
Pushing stacks from thread 0x50d000
Stack for thread 0x9d1400 = [7fffffeed94c,7fffffeee000)
Stack for thread 0x50d000 = [7fffffffcf00,800000000000)
World starting
Sending restart signal to 0x9d1400
World started
In GC_restart_handler for 0x9d1400
Waiting for restart #2 of 0x9d1400
Buildfile: build.xml
Stopping the world from 0x50d000
Sending suspend signal to 0x9d1400

By the way, this is with the following change to the code, which masks
everything but SIGUSR2 (SIG_THR_RESTART) before calling sem_post().
That didn't solve my problem:

    /* Wait until that thread tells us to restart by sending    */
    /* this thread a SIG_THR_RESTART signal.			*/
    /* SIG_THR_RESTART should be masked at this point.  Thus there	*/
    /* is no race.						*/
    if (sigfillset(&mask) != 0) ABORT("sigfillset() failed");
    if (sigdelset(&mask, SIG_THR_RESTART) != 0)
        ABORT("sigdelset() failed");
#   ifdef NO_SIGNALS
      if (sigdelset(&mask, SIGINT) != 0) ABORT("sigdelset() failed");
      if (sigdelset(&mask, SIGQUIT) != 0) ABORT("sigdelset() failed");
      if (sigdelset(&mask, SIGTERM) != 0) ABORT("sigdelset() failed");
      if (sigdelset(&mask, SIGABRT) != 0) ABORT("sigdelset() failed");
#   endif

    pthread_sigmask(SIG_SETMASK, &mask, NULL);

    /* Tell the thread that wants to stop the world that this   */
    /* thread has been stopped.  Note that sem_post() is  	*/
    /* the only async-signal-safe primitive in LinuxThreads.    */
    sem_post(&GC_suspend_ack_sem);
    me -> stop_info.last_stop_count = my_stop_count;

#if DEBUG_THREADS
    GC_printf2("Waiting for restart #%d of 0x%lx\n",
               my_stop_count, my_thread);
#endif

    do {
	    me->stop_info.signal = 0;
	    sigsuspend(&mask);             /* Wait for signal */
    } while (me->stop_info.signal != SIG_THR_RESTART);

Looks like I also missed some additional related code.  There is a
signal handler installed for SIGUSR2:

void GC_restart_handler(int sig)
{
    pthread_t my_thread = pthread_self();
    GC_thread me;

    if (sig != SIG_THR_RESTART) ABORT("Bad signal in suspend_handler");

    /* Let the GC_suspend_handler() know that we got a SIG_THR_RESTART. */
    /* The lookup here is safe, since I'm doing this on behalf  */
    /* of a thread which holds the allocation lock in order	*/
    /* to stop the world.  Thus concurrent modification of the	*/
    /* data structure is impossible.				*/
    me = GC_lookup_thread(my_thread);
    me->stop_info.signal = SIG_THR_RESTART;

    /*
    ** Note: even if we didn't do anything useful here,
    ** it would still be necessary to have a signal handler,
    ** rather than ignoring the signals, otherwise
    ** the signals will not be delivered at all, and
    ** will thus not interrupt the sigsuspend() above.
    */

#if DEBUG_THREADS
    GC_printf1("In GC_restart_handler for 0x%lx\n", pthread_self());
#endif
}




More information about the freebsd-threads mailing list