svn commit: r310423 - head/sys/kern

John Baldwin jhb at freebsd.org
Fri Dec 23 03:40:14 UTC 2016


On Thursday, December 22, 2016 11:26:01 AM Mark Johnston wrote:
> On Thu, Dec 22, 2016 at 10:39:12AM -0800, John Baldwin wrote:
> > On Thursday, December 22, 2016 05:51:44 PM Mark Johnston wrote:
> > > Author: markj
> > > Date: Thu Dec 22 17:51:44 2016
> > > New Revision: 310423
> > > URL: https://svnweb.freebsd.org/changeset/base/310423
> > > 
> > > Log:
> > >   Revert part of r300109.
> > >   
> > >   The removal of TAILQ_FOREACH_SAFE introduced a small race: when the last
> > >   thread on a sleepqueue is awoken, it reclaims the sleepqueue and may begin
> > >   executing on a different CPU before sleepq_resume_thread() returns. This
> > >   leaves a window during which it may go back to sleep and incorrectly be
> > >   awoken again by the caller of sleepq_broadcast().
> > 
> > This is very subtle.  
> 
> :(

Which also means tracking this down was a nice catch. :)

> > The issue is that the last sleepq_resume_thread transfers
> > ownership of 'sq' from the wait channel that the sleepq_broadcast has locked,
> > to the thread being resumed.  
> 
> Right, that's what I meant by "reclaims the sleepqueue." One other
> requirement for hitting the race is that the thread goes back to sleep
> on a wait channel that hashes to a different sleepchain, else the
> sleepchain lock held by the sleepq_broadcast() caller is, I believe,
> sufficient to prevent the reuse of the sleepqueue before the loop has
> terminated.
> 
> > I thought about using a local TAILQ_HEAD and
> > using TAILQ_CONCAT to move the list of threads out of the sleep queue and then
> > walking that list.  However, a comment explaining this transfer of ownership
> > (and that we can't safely access 'sq' after the last thread is resumed) is
> > probably sufficient (but necessary I think).  Do you feel like adding one?
> 
> How about:
> 
> Index: subr_sleepqueue.c
> ===================================================================
> --- subr_sleepqueue.c	(revision 310423)
> +++ subr_sleepqueue.c	(working copy)
> @@ -892,7 +892,12 @@
>  	KASSERT(sq->sq_type == (flags & SLEEPQ_TYPE),
>  	    ("%s: mismatch between sleep/wakeup and cv_*", __func__));
>  
> -	/* Resume all blocked threads on the sleep queue. */
> +	/*
> +	 * Resume all blocked threads on the sleep queue.  The last thread will
> +	 * be given ownership of sq and may re-enqueue itself before
> +	 * sleepq_resume_thread() returns, so we must cache the "next" queue
> +	 * item at the beginning of the final iteration.
> +	 */
>  	wakeup_swapper = 0;
>  	TAILQ_FOREACH_SAFE(td, &sq->sq_blocked[queue], td_slpq, tdn) {
>  		thread_lock(td);

That looks great, thanks!
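
For anyone reading along, a rough userland sketch (not the kernel code) of
why the loop shape matters here. All of the fake_* and broadcast_* names are
made up for illustration, and the "last thread takes sq and goes back to
sleep" step is simulated inline in a single thread rather than actually
racing on another CPU. The TAILQ_FIRST loop is only an assumed shape for a
non-SAFE loop, not a quote of r300109. The second loop is a hand-written
equivalent of TAILQ_FOREACH_SAFE, and the third is the TAILQ_CONCAT idea.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-ins for struct thread and struct sleepqueue. */
struct fake_thread {
	int wakeups;		/* times this thread was resumed */
	int resleep;		/* go back to sleep once if resumed last */
	TAILQ_ENTRY(fake_thread) link;
};
TAILQ_HEAD(fake_threadq, fake_thread);

struct fake_sleepq {
	struct fake_threadq blocked;
};

/*
 * Stand-in for sleepq_resume_thread().  If td was the last entry on
 * 'from', pretend it now owns sq and immediately sleeps on it again.
 */
static void
fake_resume(struct fake_threadq *from, struct fake_sleepq *sq,
    struct fake_thread *td)
{
	TAILQ_REMOVE(from, td, link);
	td->wakeups++;
	if (TAILQ_EMPTY(from) && td->resleep) {
		td->resleep = 0;
		TAILQ_INSERT_TAIL(&sq->blocked, td, link);
	}
}

/* Assumed non-SAFE shape: re-reads sq after every resume. */
static void
broadcast_first_loop(struct fake_sleepq *sq)
{
	struct fake_thread *td;

	while ((td = TAILQ_FIRST(&sq->blocked)) != NULL)
		fake_resume(&sq->blocked, sq, td);
}

/* Caches "next" before resuming, as TAILQ_FOREACH_SAFE does. */
static void
broadcast_cached_next(struct fake_sleepq *sq)
{
	struct fake_thread *td, *tdn;

	for (td = TAILQ_FIRST(&sq->blocked); td != NULL; td = tdn) {
		tdn = TAILQ_NEXT(td, link);
		fake_resume(&sq->blocked, sq, td);
	}
}

/* Moves the sleepers to a local head first, then never touches sq. */
static void
broadcast_concat(struct fake_sleepq *sq)
{
	struct fake_threadq local;
	struct fake_thread *td;

	TAILQ_INIT(&local);
	TAILQ_CONCAT(&local, &sq->blocked, link);
	assert(TAILQ_EMPTY(&sq->blocked));
	while ((td = TAILQ_FIRST(&local)) != NULL)
		fake_resume(&local, sq, td);
}

/* Queue three sleepers, broadcast, and report wakeups seen by the last. */
static int
last_thread_wakeups(void (*broadcast)(struct fake_sleepq *))
{
	struct fake_sleepq sq;
	struct fake_thread td[3] = {{ 0 }};
	int i;

	TAILQ_INIT(&sq.blocked);
	for (i = 0; i < 3; i++)
		TAILQ_INSERT_TAIL(&sq.blocked, &td[i], link);
	td[2].resleep = 1;
	broadcast(&sq);
	return (td[2].wakeups);
}
```

With these stand-ins, last_thread_wakeups(broadcast_first_loop) comes out
as 2 (the spurious second wakeup), while broadcast_cached_next and
broadcast_concat both give 1 and leave the re-queued thread asleep on sq.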

-- 
John Baldwin

