request for review, callout fix.

Alfred Perlstein alfred at freebsd.org
Fri Mar 21 16:52:06 PDT 2008


* John Baldwin <jhb at freebsd.org> [080321 13:55] wrote:
> On Friday 21 March 2008 04:30:39 pm Alfred Perlstein wrote:
> > * John Baldwin <jhb at freebsd.org> [080321 12:08] wrote:
> > > On Friday 21 March 2008 12:32:25 am Alfred Perlstein wrote:
> > > > Hi guys, attached is a fix for the timeout/untimeout race with
> > > > Giant locked code.
> > > > 
> > > > Basically the old code would make the callout inaccessable
> > > > right before calling it inside of softclock.
> > > > 
> > > > However only the callout lock is held, so when switching to
> > > > the callout's associated mutex (in this case Giant) there's
> > > > a race where a "local" (timeout/untimeout) callout would be
> > > > fired even if stopped.
> > > > 
> > > > This patch fixes that.  We've run several hours of regression
> > > > testing on a version of this for 6.x.
> > > > 
> > > > People internal to Juniper and iedowse@ helped with this.
> > > > 
> > > > Please review/comment.
> > > 
> > > Curious as to how c->c_flags could change if CALLOUT_LOCAL_ALLOC is set?  
> > > Since it hasn't been enqueued on the callfree list, it isn't visible to 
> any 
> > > other code, so nothing should be able to mark it active or pending.  IOW, 
> you 
> > > should be able to do this in your second hunk:
> > > 
> > > 	if (c_flags & CALLOUT_LOCAL_ALLOC) {
> > > 		KASSERT(c->c_flags == CALLOUT_LOCAL_ALLOC, ("corrupted callout"));
> > > 		c->c_func = NULL;
> > > 		SLIST_INSERT_HEAD(...);
> > > 	}
> > 
> > It's more hairy than that.
> > 
> > Actually, I think you're right...
> > 
> > The confusion is the race for "callout_stop_safe",
> > but now that I think about it, the softclock will only
> > "grab" this callout if untimeout has NOT yet been called.
> >
> > If untimeout HAS been called, then softclock won't even see
> > the callout.
> 
> Yes.
> 
> > If untimeout HAS NOT been called and softclock grabs the
> > callout, it will have cleared CALLOUT_PENDING and then
> > untimeout (callout_stop_safe) will no longer free it.
> 
> Yes.
>  
> > Therefore it is safe to omit the check for flags as you
> > suggest.
> >
> > Is that right?
> 
> Yes, I believe so.  The tricky case is the race you are originally trying to 
> fix which is that the untimeout() comes in after the spin lock is dropped but 
> before Giant is acquired, but that falls under your second case above.

Thank you, I'll be committing later tonight.

-Alfred


More information about the freebsd-smp mailing list