svn commit: r234952 - in head/sys: kern sys

Thu May 3 22:24:26 UTC 2012

2012/5/3 Attilio Rao <attilio at freebsd.org>:
> 2012/5/3 Konstantin Belousov <kostikbel at gmail.com>:
>> On Thu, May 03, 2012 at 10:06:53PM +0100, Attilio Rao wrote:
>>> 2012/5/3 Konstantin Belousov <kostikbel at gmail.com>:
>>> > On Thu, May 03, 2012 at 02:14:20PM +0100, Attilio Rao wrote:
>>> >> 2012/5/3, Konstantin Belousov <kostikbel at gmail.com>:
>>> >> > On Thu, May 03, 2012 at 12:02:08PM +0100, Attilio Rao wrote:
>>> >> >> 2012/5/3, Konstantin Belousov <kib at freebsd.org>:
>>> >> >> > Author: kib
>>> >> >> > Date: Thu May  3 10:38:02 2012
>>> >> >> > New Revision: 234952
>>> >> >> > URL: http://svn.freebsd.org/changeset/base/234952
>>> >> >> >
>>> >> >> > Log:
>>> >> >> >   When callout_reset_on() cannot immediately migrate a callout since it
>>> >> >> >   is running on other cpu, the CALLOUT_PENDING flag is temporarily
>>> >> >> >   cleared. Then, callout_stop() on this, in fact active, callout fails
>>> >> >> >   because CALLOUT_PENDING is not set, and callout_stop() returns 0.
>>> >> >> >
>>> >> >> >   Now, in sleepq_check_timeout(), the failed callout_stop() causes the
>>> >> >> >   sleepq code to execute mi_switch() without even setting the wmesg,
>>> >> >> >   since the switch-out is supposed to be transient. In fact, the thread
>>> >> >> >   is put off the CPU for full timeout interval, instead of being put on
>>> >> >> >   runq immediately.  Until timeout fires, the process is unkillable for
>>> >> >> >   obvious reasons.
>>> >> >> >
>>> >> >> >   Fix this by marking the migrating callouts with CALLOUT_DFRMIGRATION
>>> >> >> >   flag. The flag is cleared by callout_stop_safe() when the function
>>> >> >> >   detects a migration, besides returning the success. The softclock()
>>> >> >> >   rechecks the flag for migrating callout and cancels its execution if
>>> >> >> >   the flag was cleared meantime.
>>> >> >>
>>> >> >> Can you please clarify why you cannot simply drop the deferred
>>> >> >> migration in the case !CALLOUT_PENDING in callout_stop_safe()?
>>> >> >
>>> >> > I probably can, I think I went with the route of committed patch
>>> >> > because it is slightly less work. Also, the comment in the while()
>>> >> > loop suggested me to rely on softclock.
>>> >>
>>> >> I don't think this is more work at all, the attached patch
>>> >> (pre-r234952, untested) should address it properly in few than 10
>>> >> lines:
>>> >> http://www.freebsd.org/~attilio/callout_cancel_mig_stop.patch
>>> >>
>>> >> without the need to add further flags and re-using existing mechanisms.
>>> >
>>> > (cc->cc_curr != c) is not the case which caused the issue. It might be
>>> > needed to treatened this way, but the reported case is opposite.
>>>
>>> Yes, of course, because the migration handover happens in the same
>>> critical context of cc->cc_curr == c, but now I wonder if this fix is
>>> really right.
>>>
>>> It seems to me that in the case you describe callout_stop() must
>>> return 0 and the migration must not be cancelled because the callout
>>> is not stopped. It is not stopped not because of the deferred
>>> migration but because cc->cc_curr == c. It seems a perfectly valid
>>> situation to me.
>> Yes, and my patch makes the callout to be indeed stopped right after
>> migration is finished. Did you looked at the patch itself ?
>>
>> What is the valid situation ? callout_stop returning 0 but not stopping
>> a pending callout ? I have to disagree.
>
> The function callout_stop() cancels a callout if it is currently pending.
>     If the callout is pending, then callout_stop() will return a non-zero
>     value.  If the callout is not set, has already been serviced or is cur‐
>     rently being serviced, then zero will be returned.  If the callout has an
>     associated mutex, then that mutex must be held when this function is
>     called.
>
> [ From the callout manpage ]
>
> If the "callout is currently being serviced" means cc->cc_curr == c
> and it must return 0.

Elaborating some more, I see a discrepancy here in the callout
interface, which is also present pre-your patch and pre-migration
delay.

Basically, immagine a callout rearmed during its callback (pretty
typical) and a callout_stop() running just after the callout has been
rearmed and it is *still* in the callback.
What we find is that CALLOUT_PENDING is on and that cc_curr == cc. I
don't think that the callout should stop successfully in this case.
However, because of how _callout_stop_safe() is written,
CALLOUT_PENDING check has precedence and wins, returning 1 and
removing the CALLOUT_PENDING flag, but please note that the callback
is still running (even if only for little time).

I think this generally works ok because most of the callout callbacks
rearm the callout as last thing in their operation. But I think this
is highly fragile and we cannot really rely on this feature.

You are seeing a problem in the deferred migration case because it
does the other way around, it prefers the check over cc_curr == cc to
the "pending" (callout migration in this case).

I think this is only a problem, also, with callouts which don't have a
lock associated with them, like the sleepqueue case, because otherwise
the interlock would leave the state consistent.

I think we need to think carefully about a pattern for this case that
deals with all the races, I need to give this more thinking, but
definitively it seems to me we need a patch at the callout policy
level.

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein