Abolishing sleeps in issignal()

Jeff Roberson jroberson at chesapeake.net
Mon Oct 8 14:37:25 PDT 2007


During the work on thread lock I observed that there is a significant 
amount of locking involved in our signal paths right now, and these locks 
show up as contended in many workloads.  Furthermore, requiring a DEF 
(MTX_DEF) mutex complicates sleep queues by forcing them to drop the 
spinlock to check for signals and then recheck for races.

The current issignal() code will actually msleep in the case of a 
stopevent() requested by the debugger.  This is fine for signals that 
would normally abort the sleep anyway, but SIGSTOP actually leaves the 
thread on the sleep queue and tries to resume the sleep after the stop has 
cleared.  So SIGSTOP combined with a stopevent() actually breaks because 
the stopevent() removes the thread from the sleep queue.  I'm not certain 
what the failure mode is currently, but I'm certain that it's wrong.

What I'd like to do is stop sleeping in issignal() altogether.  For 
regular restartable syscalls this would mean failing back out to ast() 
where we'd then handle the signals including SIGSTOP.  After SIGCONT we'd 
then restart the syscall.  For non-restartable syscalls we could have a 
special issignal() variant, called when msleep()/cv_timedwait_sig() 
return interrupted, that checks for SIGSTOP/debugger events and sleeps 
within a loop, retrying the operation.  This would preserve the current 
behavior, in which debugging events and SIGSTOP do not abort 
non-restartable syscalls.
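
To make the non-restartable case concrete, here is a minimal userspace 
model of the retry loop.  It is only a sketch: every name in it 
(model_thread, model_sleep_sig, model_stop_check, 
model_nonrestartable_wait) is hypothetical and none of it is kernel code.  
The "script" array is an assumption that stands in for whatever actually 
wakes the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Why a modelled sleep woke up.  Hypothetical, for illustration only. */
enum wake_reason { WAKE_NORMAL, WAKE_STOP, WAKE_SIGNAL };

struct model_thread {
	const enum wake_reason *script;	/* outcome of each successive sleep */
	size_t len;			/* number of entries in script */
	size_t pos;			/* next script entry to consume */
	enum wake_reason last;		/* reason for the most recent wakeup */
	int stops_handled;		/* stop/debug events absorbed so far */
};

/* Stand-in for msleep()/cv_timedwait_sig(): 0 on wakeup, else EINTR. */
static int
model_sleep_sig(struct model_thread *td)
{
	assert(td->pos < td->len);
	td->last = td->script[td->pos++];
	return (td->last == WAKE_NORMAL ? 0 : EINTR);
}

/*
 * Stand-in for the proposed issignal variant.  Returns true if the
 * interruption was only a SIGSTOP/debugger event.  Real code would
 * sleep here until the stop clears; the model just counts the event.
 */
static bool
model_stop_check(struct model_thread *td)
{
	if (td->last != WAKE_STOP)
		return (false);
	td->stops_handled++;
	return (true);
}

/* Retry the sleep while the only reason for EINTR is a stop event. */
static int
model_nonrestartable_wait(struct model_thread *td)
{
	int error;

	do {
		error = model_sleep_sig(td);
	} while (error == EINTR && model_stop_check(td));
	return (error);
}
```

With a script of { WAKE_STOP, WAKE_STOP, WAKE_NORMAL } the wait returns 0 
after absorbing two stops, while { WAKE_STOP, WAKE_SIGNAL } returns EINTR 
after one.  The model ignores timeouts; a real version would presumably 
have to recompute the remaining time before each retry.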

Once we have moved the location of the sleeps it will be possible to check 
for signals using a spinlock without dropping the sleep queue lock in 
sleepq_catch_signals().
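
The benefit of checking under a spinlock can be illustrated with another 
small userspace model.  Again this is a sketch under assumptions, not the 
kernel implementation: the model_* names are hypothetical, and for 
simplicity it assumes the pending-signal state is protected by the same 
spin lock as the queue, which the real design need not do.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct model_sleepq {
	atomic_flag lock;	/* spin lock standing in for the sleepq lock */
	bool sig_pending;	/* protected by lock (model assumption) */
	bool on_queue;		/* protected by lock */
};

static void
model_sleepq_init(struct model_sleepq *sq)
{
	assert(sq != NULL);
	atomic_flag_clear(&sq->lock);
	sq->sig_pending = false;
	sq->on_queue = false;
}

static void
model_spin_lock(struct model_sleepq *sq)
{
	while (atomic_flag_test_and_set_explicit(&sq->lock,
	    memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void
model_spin_unlock(struct model_sleepq *sq)
{
	atomic_flag_clear_explicit(&sq->lock, memory_order_release);
}

/*
 * Sleeper side: check for a signal and enqueue in one critical
 * section, so the lock is never dropped between check and sleep.
 */
static int
model_catch_signals(struct model_sleepq *sq)
{
	int error = 0;

	model_spin_lock(sq);
	if (sq->sig_pending)
		error = EINTR;
	else
		sq->on_queue = true;
	model_spin_unlock(sq);
	return (error);
}

/* Sender side: returns true if a queued sleeper was removed (woken). */
static bool
model_post_signal(struct model_sleepq *sq)
{
	bool woke;

	model_spin_lock(sq);
	sq->sig_pending = true;
	woke = sq->on_queue;
	sq->on_queue = false;
	model_spin_unlock(sq);
	return (woke);
}
```

Because both sides hold the same lock, either the sleeper sees the 
pending signal and returns EINTR, or the sender finds the sleeper on the 
queue and removes it.  There is no window in which the lock is dropped 
and the race has to be rechecked.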

What I'd like from readers on arch@ is for you to consider whether there 
are cases other than non-restartable syscalls that will break if 
msleep/sleepqs return EINTR from SIGSTOP and debug events.  Also, is there 
an authoritative list of non-restartable syscalls anywhere?  It's just 
those involving timevals, right?  nanosleep/poll/select/kqueue... others?

I intend to do this work for 8.0, hopefully very early on, so we have 
plenty of time to shake out bugs; this signal code tends to be very 
delicate.

Thanks,
Jeff

