request for review, callout fix.
Alfred Perlstein
alfred at freebsd.org
Fri Mar 21 16:52:06 PDT 2008
* John Baldwin <jhb at freebsd.org> [080321 13:55] wrote:
> On Friday 21 March 2008 04:30:39 pm Alfred Perlstein wrote:
> > * John Baldwin <jhb at freebsd.org> [080321 12:08] wrote:
> > > On Friday 21 March 2008 12:32:25 am Alfred Perlstein wrote:
> > > > Hi guys, attached is a fix for the timeout/untimeout race with
> > > > Giant locked code.
> > > >
> > > > Basically the old code would make the callout inaccessable
> > > > right before calling it inside of softclock.
> > > >
> > > > However only the callout lock is held, so when switching to
> > > > the callout's associated mutex (in this case Giant) there's
> > > > a race where a "local" (timeout/untimeout) callout would be
> > > > fired even if stopped.
> > > >
> > > > This patch fixes that. We've run several hours of regression
> > > > testing on a version of this for 6.x.
> > > >
> > > > People internal to Juniper and iedowse@ helped with this.
> > > >
> > > > Please review/comment.
> > >
> > > Curious as to how c->c_flags could change if CALLOUT_LOCAL_ALLOC is set?
> > > Since it hasn't been enqueued on the callfree list, it isn't visible to
> any
> > > other code, so nothing should be able to mark it active or pending. IOW,
> you
> > > should be able to do this in your second hunk:
> > >
> > > if (c_flags & CALLOUT_LOCAL_ALLOC) {
> > > KASSERT(c->c_flags == CALLOUT_LOCAL_ALLOC, ("corrupted callout"));
> > > c->c_func = NULL;
> > > SLIST_INSERT_HEAD(...);
> > > }
> >
> > It's more hairy than that.
> >
> > Actually, I think you're right...
> >
> > The confusion is the race for "callout_stop_safe",
> > but now that I think about it, the softclock will only
> > "grab" this callout if untimeout has NOT yet been called.
> >
> > If untimeout HAS been called, then softclock won't even see
> > the callout.
>
> Yes.
>
> > If untimeout HAS NOT been called and softclock grabs the
> > callout, it will have cleared CALLOUT_PENDING and then
> > untimeout (callout_stop_safe) will no longer free it.
>
> Yes.
>
> > Therefore it is safe to omit the check for flags as you
> > suggest.
> >
> > Is that right?
>
> Yes, I believe so. The tricky case is the race you are originally trying to
> fix which is that the untimeout() comes in after the spin lock is dropped but
> before Giant is acquired, but that falls under your second case above.
Thank you, I'll be committing later tonight.
-Alfred
More information about the freebsd-smp
mailing list