signal handler priority issue

Daniel Eischen eischen at vigrid.com
Fri Jun 11 05:29:57 GMT 2004


On Thu, 10 Jun 2004, Sean McNeil wrote:

> On Thu, 2004-06-10 at 21:55, Daniel Eischen wrote:
> > On Thu, 10 Jun 2004, Sean McNeil wrote:
> > 
> > > Here is what I see:
> > > 
> > > master thread calls pthread_kill with SIGUSR1 and waits on semaphore.
> > > other thread gets signal and calls sem_post.  It yields the scheduler.
> > 
> > This is fine as long as this thread doesn't get a signal
> > until after sem_post().  Being signal safe doesn't mean
> > that other threads can't be scheduled.
> > 
> > > master thread gets semaphore and continues on its way.
> > > master thread calls pthread_kill with SIGUSR2 and keeps going.
> > 
> > It can't keep going if there is a possibility that it can
> > send the same thread another SIGUSR2.
> 
> I don't follow. Sorry.

If the master thread does:

	for (i = 0; i < 4; i++) {
		pthread_kill(slave, SIGUSR1);
		sem_wait(&slave_semaphore);
		pthread_kill(slave, SIGUSR2);
	}

You can see that there is a potential race condition where
the slave thread gets SIGUSR1 and SIGUSR2 very close together.
It is even possible to get them together in one sigsuspend()
(if they are both unmasked in the suspend mask).

You could fix the race by blocking SIGUSR1 from within
the signal handler (like I described in my last email).
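
A rough sketch of that fix follows, under some assumptions of mine.
SIGUSR1 is kept blocked for the whole handler through sa_mask rather
than by an explicit call inside the handler. SIGUSR2 is also kept blocked
until sigsuspend() so that a resume sent early stays pending instead of
being consumed before the wait begins. Apart from slave_semaphore, the
names (run_rounds, suspend_handler, resume_handler, done) are
hypothetical and not from the original program.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <stdatomic.h>
#include <time.h>

static sem_t slave_semaphore;               /* name taken from the loop above */
static atomic_int done;                     /* hypothetical: tells the slave to exit */
static volatile sig_atomic_t resumed;       /* set by the SIGUSR2 handler */
static volatile sig_atomic_t suspend_calls; /* times the SIGUSR1 handler ran */

static void resume_handler(int sig)
{
	(void)sig;
	resumed = 1;
}

static void suspend_handler(int sig)
{
	sigset_t wait_mask;

	(void)sig;
	suspend_calls++;
	resumed = 0;

	/* While waiting, everything stays blocked except SIGUSR2. */
	sigfillset(&wait_mask);
	sigdelset(&wait_mask, SIGUSR2);

	sem_post(&slave_semaphore);       /* tell the master we are stopped */
	while (!resumed)
		sigsuspend(&wait_mask);   /* atomically unblock SIGUSR2 and wait */
	/* Returning restores the pre-handler mask, which unblocks SIGUSR1. */
}

static void *slave_main(void *arg)
{
	struct timespec nap = { 0, 1000000 };   /* 1 ms */

	(void)arg;
	while (!atomic_load(&done))
		nanosleep(&nap, NULL);
	return NULL;
}

/* Runs the master loop and returns how often the suspend handler ran. */
int run_rounds(int rounds)
{
	struct sigaction sa;
	pthread_t slave;
	int i, rc;

	suspend_calls = 0;
	atomic_store(&done, 0);
	rc = sem_init(&slave_semaphore, 0, 0);
	assert(rc == 0);

	/* Both signals are blocked while either handler is running. */
	sigemptyset(&sa.sa_mask);
	sigaddset(&sa.sa_mask, SIGUSR1);
	sigaddset(&sa.sa_mask, SIGUSR2);
	sa.sa_flags = 0;
	sa.sa_handler = suspend_handler;
	sigaction(SIGUSR1, &sa, NULL);
	sa.sa_handler = resume_handler;
	sigaction(SIGUSR2, &sa, NULL);

	rc = pthread_create(&slave, NULL, slave_main, NULL);
	assert(rc == 0);
	for (i = 0; i < rounds; i++) {
		pthread_kill(slave, SIGUSR1);
		sem_wait(&slave_semaphore);
		pthread_kill(slave, SIGUSR2);
	}
	atomic_store(&done, 1);
	pthread_join(slave, NULL);
	sem_destroy(&slave_semaphore);
	return suspend_calls;
}
```

With this arrangement a SIGUSR1 sent for the next round simply stays
pending until the handler has seen SIGUSR2 and returned, so each round
should enter the suspend handler exactly once.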

> > > Later, master calls pthread_kill with SIGUSR1 and waits on semaphore.
> > > other thread gets signal and calls sem_post.  It yields the scheduler.
> > 
> > Why is it getting SIGUSR1?  It is waiting for SIGUSR2, not
> > SIGUSR1.  You need to mask SIGUSR1 before the sem_post() and
> > until after the sigsuspend() on SIGUSR2.
> 
> The problem is that it never gets to the sigsuspend.  It yields right
> after the sem_post and gets interrupted again by another SIGUSR1.  I see

Right, you need to block SIGUSR1 _before_ sem_post() and unblock
it after sigsuspend().

> this because it never prints a message that is following the sigsuspend
> and the sig handler count is incrementing showing me that it is called 2
> times before getting to the sigsuspend.

[ ... ]

> > Nope, that is allowable.  Please don't confuse signal safe with
> > "will not yield the CPU".  Also, on SMP, there are no guarantees
> > regardless!
> 
> OK.  Then I suppose what is happening is that it is losing the SIGUSR2. 
> But I don't really know why a signal handler would not be pushed to be
> high priority.  Doesn't it seem logical that it should keep the
> scheduler past a sem_post?  Perhaps that is the issue?  Should a thread
> inside a signal handler have highest priority?

It doesn't solve anything because it could still block on other
lower-level locks, and the posted thread could be run on another
CPU.

> > I'm not saying that there isn't a bug somewhere, but don't go
> > reading more into what the requirements of sem_post() are.
> 
> I guess I read more into the comment than I should:
> 
> 		/*
> 		 * sem_post() is required to be safe to call from within
> 		 * signal handlers.  Thus, we must enter a critical region.
> 		 */
> 
> I took this to mean (plus the critical enter/leave surrounding it) that
> sem_post should not yield the scheduler when inside a signal handler.

It tries hard not to leave the scheduler, but mostly it prevents
signals from getting delivered.  If the thread has to wait on a low-level
lock (one that is used to protect a semaphore), then it can block.

It could be that semaphores aren't working correctly, but
the fact that you can yield the CPU isn't the real problem.

-- 
Dan Eischen



More information about the freebsd-threads mailing list