signal handler priority issue
Sean McNeil
sean at mcneil.com
Fri Jun 11 06:26:06 GMT 2004
On Thu, 2004-06-10 at 23:06, Daniel Eischen wrote:
> On Thu, 10 Jun 2004, Sean McNeil wrote:
>
> > On Thu, 2004-06-10 at 22:29, Daniel Eischen wrote:
> > > > >
> > > > > It can't keep going if there is a possibility that it can
> > > > > send the same thread another SIGUSR2.
> > > >
> > > > I don't follow. Sorry.
> > >
> > > If the master thread does:
> > >
> > > for (i = 0; i < 4; i++) {
> > >     pthread_kill(slave, SIGUSR1);
> > >     sem_wait(&slave_semaphore);
> > >     pthread_kill(slave, SIGUSR2);
> > > }
> > >
> > > You can see that there is a potential race condition where
> > > the slave thread gets SIGUSR1 and SIGUSR2 very close together.
> > > It is even possible to get them together in one sigsuspend()
> > > (if they are both unmasked in the suspend mask).
> > >
> > > You could fix the race by blocking SIGUSR1 from within
> > > the signal handler (like I described in my last email).
> >
> > I take it then that when a signal handler is invoked, its signal
> > isn't masked while it runs. It isn't like a standard hardware interrupt
> > then. I'm trying what you suggest and will post results.
>
> Like I said before, it depends on the mask of the installed
> signal handler (sigact.sa_mask). You should use sigaction()
> and not signal() to get the desired behavior.
>
> Your other output looked strange. I was expecting the
> "restart" count to start at 1, not 2.
That is my fault. I didn't give enough output. That count is how many
times the world has been stopped. Here is all the output:
Stopping the world from 0x50d000
World stopped from 0x50d000
Pushing stacks from thread 0x50d000
Stack for thread 0x50d000 = [7fffffffdfa0,800000000000)
World starting
World started
About to start new thread from thread 0x50D000
Started thread 0x9D1400
Starting thread 0x9d1400
pid = 85636
sp = 0x7fffffeedf80
start_routine = 0x200db4960
Unable to locate tools.jar. Expected to find it in /usr/local/gcc-cvs/lib/tools.jar
Stopping the world from 0x50d000
Sending suspend signal to 0x9d1400
Suspending 0x9d1400
World stopped from 0x50d000
Pushing stacks from thread 0x50d000
Stack for thread 0x9d1400 = [7fffffeed94c,7fffffeee000)
Stack for thread 0x50d000 = [7fffffffcf00,800000000000)
World starting
Sending restart signal to 0x9d1400
World started
In GC_restart_handler for 0x9d1400
Waiting for restart #2 of 0x9d1400
Buildfile: build.xml
Stopping the world from 0x50d000
Sending suspend signal to 0x9d1400
This is with the following change to the code, by the way, which masks
everything except SIG_THR_RESTART (SIGUSR2) before calling sem_post().
That didn't solve my problem:
    /* Wait until that thread tells us to restart by sending */
    /* this thread a SIG_THR_RESTART signal. */
    /* SIG_THR_RESTART should be masked at this point. Thus there */
    /* is no race. */
    if (sigfillset(&mask) != 0) ABORT("sigfillset() failed");
    if (sigdelset(&mask, SIG_THR_RESTART) != 0) ABORT("sigdelset() failed");
#   ifdef NO_SIGNALS
      if (sigdelset(&mask, SIGINT) != 0) ABORT("sigdelset() failed");
      if (sigdelset(&mask, SIGQUIT) != 0) ABORT("sigdelset() failed");
      if (sigdelset(&mask, SIGTERM) != 0) ABORT("sigdelset() failed");
      if (sigdelset(&mask, SIGABRT) != 0) ABORT("sigdelset() failed");
#   endif
    pthread_sigmask(SIG_SETMASK, &mask, NULL);
    /* Tell the thread that wants to stop the world that this */
    /* thread has been stopped. Note that sem_post() is */
    /* the only async-signal-safe primitive in LinuxThreads. */
    sem_post(&GC_suspend_ack_sem);
    me -> stop_info.last_stop_count = my_stop_count;
#if DEBUG_THREADS
    GC_printf2("Waiting for restart #%d of 0x%lx\n", my_stop_count,
               my_thread);
#endif
    do {
        me->stop_info.signal = 0;
        sigsuspend(&mask); /* Wait for signal */
    } while (me->stop_info.signal != SIG_THR_RESTART);
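The comment in the fragment above says the restart signal should be masked before the wait, so that a restart sent early stays pending until sigsuspend() unblocks it. A minimal single-threaded sketch of that pattern follows. It is not the collector's code: restart_handler, last_signal and wait_for_restart_demo are hypothetical names, raise() stands in for pthread_kill(), and sigprocmask() stands in for pthread_sigmask().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical stand-in for me->stop_info.signal. */
static volatile sig_atomic_t last_signal = 0;

static void restart_handler(int sig)
{
    last_signal = sig;
}

/* Returns 1 if an early SIGUSR2 stayed pending until sigsuspend(), */
/* 0 if it was delivered too early, -1 on setup failure.            */
int wait_for_restart_demo(void)
{
    struct sigaction sa;
    sigset_t block, old, wait_mask;
    int seen_before_wait;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = restart_handler;
    sigfillset(&sa.sa_mask);          /* block everything in the handler */
    if (sigaction(SIGUSR2, &sa, NULL) != 0) return -1;

    /* Block the restart signal before anyone can send it. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR2);
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0) return -1;

    /* The "restart" arrives early; it is blocked, so it stays pending. */
    raise(SIGUSR2);
    seen_before_wait = (last_signal != 0);

    /* Wait with everything blocked except the restart signal. */
    sigfillset(&wait_mask);
    sigdelset(&wait_mask, SIGUSR2);
    assert(sigismember(&wait_mask, SIGUSR2) == 0);
    do {
        last_signal = 0;
        sigsuspend(&wait_mask);       /* atomically unblock and wait */
    } while (last_signal != SIGUSR2);

    sigprocmask(SIG_SETMASK, &old, NULL);
    return !seen_before_wait && last_signal == SIGUSR2;
}
```

Because the signal is blocked everywhere except inside sigsuspend(), clearing the flag at the top of the loop cannot lose a restart that was sent early.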
Looks like I also missed some additional related code. There is a
signal handler installed for SIGUSR2:
void GC_restart_handler(int sig)
{
    pthread_t my_thread = pthread_self();
    GC_thread me;
    if (sig != SIG_THR_RESTART) ABORT("Bad signal in suspend_handler");
    /* Let the GC_suspend_handler() know that we got a SIG_THR_RESTART. */
    /* The lookup here is safe, since I'm doing this on behalf */
    /* of a thread which holds the allocation lock in order */
    /* to stop the world. Thus concurrent modification of the */
    /* data structure is impossible. */
    me = GC_lookup_thread(my_thread);
    me->stop_info.signal = SIG_THR_RESTART;
    /*
    ** Note: even if we didn't do anything useful here,
    ** it would still be necessary to have a signal handler,
    ** rather than ignoring the signals, otherwise
    ** the signals will not be delivered at all, and
    ** will thus not interrupt the sigsuspend() above.
    */
#if DEBUG_THREADS
    GC_printf1("In GC_restart_handler for 0x%lx\n", pthread_self());
#endif
}
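For reference, here is a minimal sketch of the sigaction() approach Daniel describes, where sa_mask decides which signals are blocked while a handler runs. It is an illustration under assumptions, not the collector's code: on_usr1, on_usr2, install_handler and demo_sa_mask are hypothetical names, and raise() in a single thread stands in for pthread_kill() from the master thread.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t usr2_seen = 0;
static volatile sig_atomic_t usr2_seen_inside_usr1 = -1;

static void on_usr2(int sig)
{
    (void)sig;
    usr2_seen = 1;
}

static void on_usr1(int sig)
{
    (void)sig;
    /* SIGUSR2 is in sa_mask, so this only makes it pending. */
    raise(SIGUSR2);
    usr2_seen_inside_usr1 = usr2_seen;
}

/* Install fn for signo with both SIGUSR1 and SIGUSR2 blocked */
/* for as long as the handler runs.                            */
static int install_handler(int signo, void (*fn)(int))
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = fn;
    sigemptyset(&sa.sa_mask);
    sigaddset(&sa.sa_mask, SIGUSR1);
    sigaddset(&sa.sa_mask, SIGUSR2);
    assert(sigismember(&sa.sa_mask, SIGUSR2) == 1);
    return sigaction(signo, &sa, NULL);
}

/* Returns 1 if SIGUSR2 was held back until the SIGUSR1 handler */
/* returned, 0 if it was not, -1 on setup failure.              */
int demo_sa_mask(void)
{
    usr2_seen = 0;
    usr2_seen_inside_usr1 = -1;
    if (install_handler(SIGUSR1, on_usr1) != 0) return -1;
    if (install_handler(SIGUSR2, on_usr2) != 0) return -1;

    raise(SIGUSR1);   /* runs on_usr1, then the pending SIGUSR2 */
    return usr2_seen_inside_usr1 == 0 && usr2_seen == 1;
}
```

With signal() the set of signals blocked during the handler is not under the caller's control, which is why sigaction() with an explicit sa_mask is the safer choice here.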